OF GENES AND GENOMES



(57) Abstract 

The present invention relates generally to the field of oligonucleotide synthesis. More
particularly, it concerns the assembly of genes and genomes of completely synthetic artificial
organisms. Thus, the present invention outlines a novel approach to utilizing the results of
genomic sequence information by computer directed gene synthesis based on computing on the
human genome database. Specifically, the present invention contemplates and describes the
chemical synthesis and resynthesis of genes defined by the genome sequence in a host vector
and transfer and expression of these sequences into suitable hosts.
DESCRIPTION

METHOD FOR THE COMPLETE CHEMICAL SYNTHESIS AND
ASSEMBLY OF GENES AND GENOMES

BACKGROUND OF THE INVENTION

1 Field of the Invention 

The present invention relates generally to the field of oligonucleotide synthesis. More
particularly, it concerns the assembly of genes and genomes of completely synthetic artificial
organisms.

2 Description of Related Art 

Present research and commercial applications in molecular biology are based upon
recombinant DNA developed in the 1970s. A critical facet of recombinant DNA is molecular
cloning in plasmids, covered under the seminal patent of Cohen and Boyer (U.S. Patent 4,740,470,
"Biologically functional molecular chimeras"). This patent teaches a method for the "cutting
and splicing" of DNA molecules based upon restriction endonucleases, the introduction of these
"recombinant" molecules into host cells, and their replication in the bacterial hosts. This
technique is the basis of all molecular cloning for research and commercial purposes carried out
for the past 20 years and the basis of the field of molecular biology and genetics.

Recombinant DNA technology is a powerful technology, but is limited in utility to
modifications of existing DNA sequences which are modified through 1) restriction enzyme
cleavage sites, 2) PCR primers for amplification, 3) site-specific mutagenesis, and other
techniques. The creation of an entirely new molecule, or the substantial modification of
existing molecules, is extremely time consuming, expensive, requires complex and multiple
steps, and in some cases is impossible. Recombinant DNA technology does not permit the
creation of entirely artificial molecules, genes, genomes or organisms, but only modifications of
naturally-occurring organisms.

Current biotechnology for industrial production, for drug design and development, for
potential applications of vaccine development and genetic therapy, and for agricultural and
environmental use of recombinant DNA, depends on naturally-occurring organisms and DNA
molecules. To create or engineer new or novel functions, or to modify organisms for
specialized use (such as producing a human hormone), requires substantially complex, time
consuming and difficult manipulations of naturally-occurring DNA molecules. In some cases,
changes to naturally-occurring DNA are so complex that they are not possible in practice.
Thus, there is a need for technology that allows the creation of novel DNA molecules in a
single step without requiring the use of any existing recombinant or naturally-occurring DNA.

SUMMARY OF THE INVENTION

The present invention addresses the limitations in present recombinant nucleic acid
manipulations by providing a fast, efficient means for generating practically any nucleic acid
sequence, including entire genes, chromosomal segments, chromosomes and genomes.
Because this approach is based on a completely synthetic approach, there are no limitations,
such as the availability of existing nucleic acids, to hinder the construction of even very large
segments of nucleic acid.

Thus, in a first embodiment, there is provided a method for the construction of a
double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded
oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA
segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c)
each of said first set of oligonucleotides being complementary to two oligonucleotides of said
second set of oligonucleotides, (ii) annealing said first and said second set of oligonucleotides,
and (iii) treating said annealed oligonucleotides with a ligating enzyme. Optional steps provide
for the synthesis of the oligonucleotide sets and the transformation of host cells with the
resulting DNA segment.



In particular embodiments, the DNA segment is 100, 200, 300, 400, 800, 1000, 1500, 2000,
4000, 8000, 10,000, 12,000, 18,000, 20,000, 40,000, 80,000, 100,000, 10^6, 10^7, 10^8, 10^9 or more
base pairs in length. Indeed, it is contemplated that the methods of the present invention will be
able to create entire artificial genomes of lengths comparable to known bacterial, yeast, viral,
mammalian, amphibian, reptilian and avian genomes. In more particular embodiments, the DNA
segment is a gene encoding a protein of interest. The DNA segment further may include
non-coding elements such as origins of replication, telomeres, promoters, enhancers, transcription
and translation start and stop signals, introns, exon splice sites, chromatin scaffold components
and other regulatory sequences. The DNA segment may comprise multiple genes,
chromosomal segments, chromosomes and even entire genomes. The DNA segments may be
derived from prokaryotic or eukaryotic sequences including bacterial, yeast, viral, mammalian,
amphibian, reptilian, avian, plants, archebacteria and other DNA containing living organisms.

The oligonucleotide sets preferably are comprised of oligonucleotides of between about 15
and 100 bases and more preferably between about 20 and 50 bases. Specific lengths include,
but are not limited to, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100. Depending on the size, the
overlap between the oligonucleotides of the two sets may be designed to be between 5 and 75
bases per oligonucleotide pair.

The oligonucleotides preferably are treated with polynucleotide kinase, for example, T4
polynucleotide kinase. The kinasing can be performed prior to mixing of the oligonucleotide
sets or after, but before annealing. After annealing, the oligonucleotides are treated with an
enzyme having a ligating function. For example, a DNA ligase typically will be employed for
this function. However, topoisomerase [illegible] operates at room temperature, and may be
used instead of ligase.

In a second embodiment, there is provided a method for construction of a
double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded
oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA
segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c)
each of said first set of oligonucleotides being complementary to two oligonucleotides of said
second set of oligonucleotides, (ii) annealing pairs of complementary oligonucleotides to
produce a set of first annealed products, wherein each pair comprises an oligonucleotide from
each of said first and said second sets of oligonucleotides, (iii) annealing pairs of first annealed
products having complementary sequences to produce a set of second annealed products, (iv)
repeating the process until all annealed products have been annealed into a single DNA
segment, and (v) treating said annealed products with ligating enzyme.
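
The pairwise scheme of the second embodiment can be pictured with a small sketch. It is an illustration only, not the procedure of the specification: the function name `pairwise_rounds` is hypothetical, products are represented by text labels, and joining two neighbouring labels stands in for annealing two products with complementary ends.

```python
def pairwise_rounds(products):
    """Merge neighbouring products two at a time until one remains.

    Returns the final product and the number of annealing rounds used.
    String concatenation is only a stand-in for annealing.
    """
    if not products:
        raise ValueError("at least one product is required")
    current = list(products)
    rounds = 0
    while len(current) > 1:
        merged = []
        for i in range(0, len(current), 2):
            if i + 1 < len(current):
                merged.append(current[i] + current[i + 1])
            else:
                # An unpaired product is carried into the next round.
                merged.append(current[i])
        current = merged
        rounds += 1
    return current[0], rounds
```

Under this model the number of rounds grows with the logarithm of the number of starting products, so eight first annealed products would need three further rounds.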

In a third embodiment, there is provided a method for the construction of a
double-stranded DNA segment comprising the steps of (i) providing two sets of single-stranded
oligonucleotides, wherein (a) the first set comprises the entire plus strand of said DNA
segment, (b) the second set comprises the entire minus strand of said DNA segment, and (c)
each of said first set of oligonucleotides being complementary to two oligonucleotides of said
second set of oligonucleotides, (ii) annealing the 5' terminal oligonucleotide of said first
set of oligonucleotides with the 3' terminal oligonucleotide of said second set of
oligonucleotides, (iii) annealing the next most 5' terminal oligonucleotide of said first set of
oligonucleotides with the product of step (ii), (iv) annealing the next most 3' terminal
oligonucleotide of said second set of oligonucleotides with the product of step (iii), (v)
repeating the process until all oligonucleotides of said first and said second sets have been
annealed, and (vi) treating said annealed oligonucleotides with ligating enzyme. Optional steps
provide for the synthesis of the oligonucleotide sets and the transformation of host cells with
the resulting DNA segment. In a preferred embodiment, the 5' terminal oligonucleotide of the
first set is attached to a support, which process may include the additional step of removing the
DNA segment from the support. The support may be any support known in the art, for
example, a microtiter plate, a filter, polystyrene beads, polystyrene tray, magnetic beads,
agarose and the like.






Annealing conditions may be adjusted based on the particular strategy used for
annealing, the size and composition of the oligonucleotides, and the extent of overlap between
the oligonucleotides of the first and second sets. For example, where all the oligonucleotides
are mixed together prior to annealing, the mixture is heated to 80°C, followed by slow annealing
for between 1 and 12 h. Thus, annealing may be conducted for about 2, about 3,
about 4, about 5, about 6, about 7, about 8, about 9, or about 10 h. However, in other
embodiments, the annealing time may be as long as 24 h.
With the aid of a computer, the inventor is able to direct synthesis of a vector/gene
combination using a high throughput oligonucleotide synthesizer as a set of overlapping
[illegible] the assembly [illegible] transformation into a suitable host strain. In a particular
embodiment, the invention generates a [illegible] genome. In other embodiments, a yeast or
[illegible] expression vector system [illegible] to allow expression of each gene in a [illegible]
region in a eukaryotic host. In another embodiment, the present invention allows one of skill in
the art to [illegible] "designer gene" strategy wherein a gene or genomes of virtually any
structure may be readily designed, synthesized and expressed. Thus, eventually the technology
described herein may be employed to create entire genomes for introduction into host cells for
the creation of artificial designer living organisms.

In specific embodiments, the present invention provides a method for the synthesis of a
replication-competent, double-stranded polynucleotide, wherein the polynucleotide comprises
an origin of replication, a first coding region and a first regulatory element directing the
expression of the first coding region.

Additionally, the method may further comprise the step of amplifying the
double-stranded polynucleotide. In specific embodiments, the double-stranded polynucleotide
comprises 100, 200, 300, [further values illegible in source] base pairs in length. The first
regulatory element may be a [illegible]. In certain embodiments, the double-stranded
polynucleotide further comprises a second regulatory element, the second regulatory element
being a [illegible].

In yet a further embodiment, the double-stranded polynucleotide comprises a plurality of coding
regions and a plurality of regulatory elements. Specifically, it is contemplated that the coding
[illegible] biochemical pathway is glycolysis. More particularly, it is contemplated that the coding
regions encode enzymes selected from the group consisting of hexokinase, [illegible]
isomerase, phosphofructokinase, aldolase, triose-phosphate isomerase, glyceraldehyde-3-
phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and
pyruvate kinase enzymes of the glycolytic pathway.

In other embodiments, the biochemical pathway is lipid synthesis or cofactor synthesis.
Particularly contemplated are synthesis of lipoic acid, riboflavin synthesis and nucleotide
synthesis; the nucleotide may be a purine or a pyrimidine.

In certain other embodiments it is contemplated that the coding regions encode enzymes
involved in a cellular process selected from the group consisting of cell division, chaperone,
detoxification, peptide secretion, energy metabolism, regulatory function, DNA replication,
transcription, RNA processing and tRNA modification. In preferred embodiments, the energy
metabolism is oxidative phosphorylation.



It is contemplated that the double-stranded polynucleotide is a DNA or an RNA. In
preferred embodiments, the double-stranded polynucleotide may be a chromosome. The
double-stranded polynucleotide may be an expression construct. Specifically, the expression
construct may be a bacterial expression construct, a mammalian expression construct or a viral
expression construct. In particular embodiments, the double-stranded polynucleotide comprises
a genome selected from the group consisting of bacterial genome, yeast genome, viral genome,
mammalian genome, amphibian genome and avian genome.


In those embodiments in which the genome is a viral genome, the viral genome may be
selected from the group consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and
adeno-associated virus.

The present invention further provides a method of producing a viral particle.

Another embodiment provides a method of producing an artificial genome, wherein the
chromosome comprises all coding regions and regulatory elements found in a corresponding
natural chromosome. In particular embodiments, the corresponding natural chromosome is a
human mitochondrial genome. In other embodiments, the corresponding natural chromosome
is a chloroplast genome.
Also provided is a method of producing an artificial genetic system, wherein the system
comprises all coding regions and regulatory elements found in a corresponding natural
biochemical pathway. Such a biochemical pathway will likely possess a group of enzymes that
serially metabolize a compound. In particularly preferred embodiments, the biochemical
pathway comprises the activities required for glycolysis. In other embodiments, the
biochemical pathway comprises the enzymes required for electron transport. In still further
embodiments, the biochemical pathway comprises the enzyme activities required for
photosynthesis.

Other objects, features and advantages of the present invention will become apparent
from the following detailed description. It should be understood, however, that the detailed
description and the specific examples, while indicating preferred embodiments of the invention,
are given by way of illustration only, since various changes and modifications within the spirit
and scope of the invention will become apparent to those skilled in the art from this detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to
further demonstrate certain aspects of the present invention. The invention may be better
understood by reference to one or more of these drawings in combination with the detailed
description of specific embodiments presented herein.

FIG. 1. Flow diagram of the Jurassic Park paradigm for the construction of
synthetic organisms and reassembly of living organisms.

FIG. 2. Flow diagram of the strategy of synthetic genetics and assembly of
organisms.

FIG. 3. Flow diagram of the eight-step strategy for combinatory assembly of
oligonucleotides into complete genes or genomes.

RECTIFIED SHEET (RULE 91) 
ISA/EP 
FIG. 4A-FIG. 4C. Design of plasmid synlux4. The sequence of 4800 is annotated
with the locations of lux A+B genes, neomycin/kanamycin phosphotransferase and pUC19 
sequences. 



FIG. 5A-FIG. 5F. List of component oligonucleotides derived from the sequence of
synlux4 in FIG. 4A-FIG. 4C.



FIG. 6A-FIG. 6B. Schema for the combinatory assembly of synthetic plasmids 
from component oligonucleotides. 


FIG. 7A-FIG. 7G. SynGene program for generating overlapping oligonucleotides 
sufficient to reassemble the gene or plasmid. 

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The complete sequence of complex genomes, including the human genome, makes large
scale functional approaches to genetics possible. The present invention outlines a novel
approach to utilizing the results of genomic sequence information by computer directed gene
synthesis based on computing on the human genome database. Specifically, the invention
describes chemical synthesis and resynthesis of genes for transfer of these genes into suitable
host cells.
The present invention provides methods that can be used to synthesize, de novo, DNA
segments that encode sets of genes, either naturally occurring genes expressed from natural or
artificial promoter constructs or artificial genes derived from synthetic DNA sequences, which
encode elements of biological systems that perform a specified function or attribute of an
artificial organism, as well as entire genomes. In producing such systems and genomes, the
present invention provides the synthesis of a replication-competent, double-stranded
polynucleotide, wherein the polynucleotide has an origin of replication, a first coding region
and a first regulatory element directing the expression of the first coding region. By replication
competent, it is meant that the polynucleotide is capable of directing its own replication. Thus,
it is envisioned that the polynucleotide will possess all the cis-acting signals required to
facilitate its own synthesis. In this respect, the polynucleotide will be similar to a plasmid or a
virus, such that once placed within a cell, it is able to be replicated by a combination of the
polynucleotide's and cellular functions.

Thus, using the techniques of the present invention, one of skill in the art can create an
artificial genome that is capable of encoding all the activities required for sustaining its own
existence. Also contemplated are artificial genetic systems that are capable of encoding
enzymes and activities of a particular biochemical pathway. In such a system, it will be
desirable to have all the activities present such that the whole biochemical pathway will
operate. The co-expression of a set of enzymes required for a particular pathway constitutes a
complete genetic or biological system. For example, the co-expression of the enzymes
involved in glycolysis constitutes a complete genetic system for the production of energy in the
form of ATP from glucose. Such systems for energy production may include groups of
enzymes which naturally or artificially serially metabolize a set of compounds.

The types of biochemical pathways would include but are not limited to those for the
biosynthesis of cofactors, prosthetic groups and carriers (lipoate synthesis, riboflavin synthesis,
pyridine nucleotide synthesis); the biosynthesis of the cell envelope (membranes, lipoproteins,
porins, surface polysaccharides, lipopolysaccharides, antigens and surface structures); cellular
processes including cell division, chaperones, detoxification, protein secretion, central
intermediary metabolism (energy production via phosphorus compounds and other); energy
metabolism including aerobic, anaerobic, ATP proton motive force interconversion, electron
transport, glycolysis, triose phosphate pathway, pyruvate dehydrogenase, sugar metabolism;
purine, pyrimidine nucleotide synthesis, including ribonucleotide synthesis, nucleotide
and nucleoside interconversion, salvage of nucleosides and nucleotides, sugar-nucleotide
biosynthesis and conversion; regulatory functions including transcriptional and translational
controls; DNA replication including degradation of DNA, DNA replication,
modification, recombination and repair; transcription including degradation of DNA,
DNA-dependent RNA polymerase and transcription factors; RNA processing; translation including
amino acyl tRNA synthetases, degradation of peptides and glycopeptides, protein modification,
ribosome synthesis and modification, tRNA modification; translation factors; transport and
binding proteins including amino acid, peptide, amine, carbohydrate, organic alcohol, organic
acid and cation transport; and other systems for the adaptation, specific function or survival of
an artificial organism.



A. Definitions 

DNA segment - a linear piece of DNA having a double-stranded region and both 5'-
and 3'-ends; the segment may be of any length sufficiently long to be created by the
hybridization of at least two oligonucleotides having complementary regions.

Oligonucleotides - small DNA segments, single-stranded or double-stranded,
comprised of the nucleotide bases A, T, G and C linked through phosphate bonds;
oligonucleotides typically range from about 10 to 100 base pairs.

Plus strand - by convention, the single-strand of a double-stranded DNA that starts
with the 5' end to the left as one reads the sequence.

Minus strand - by convention, the single-strand of a double-stranded DNA that starts
with the 3' end to the left as one reads the sequence.

Complementary - where two nucleic acids have at least a portion of their sequences,
when read in opposite (5'->3'; 3'->5') direction, that pair sequential nucleotides in the
following fashion: A-T, G-C, T-A, C-G.
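
The strand and pairing conventions defined here can be written down as a minimal sketch. The function names are hypothetical and chosen for illustration; only unmodified A, T, G and C bases are assumed.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def minus_strand(plus):
    """Minus strand written 3'->5' (left to right) beneath a plus strand written 5'->3'."""
    return "".join(PAIRS[base] for base in plus.upper())

def reverse_complement(seq):
    """The complementary strand rewritten in the usual 5'->3' direction."""
    return minus_strand(seq)[::-1]

def is_complementary(a, b):
    """True if strands a and b, both written 5'->3', pair along their full length."""
    return len(a) == len(b) and reverse_complement(a) == b.upper()
```

For instance, the plus strand 5'-ATGC-3' has the minus strand 3'-TACG-5', which reads GCAT when written 5'->3'.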

Oligonucleotide sets - a plurality of oligonucleotides that, taken together, comprise the
sequence of a plus or minus strand of a DNA segment.

Annealed products - two or more oligonucleotides having complementary regions,
where they are permitted, under proper conditions, to base pair, thereby producing double
stranded regions.
B. The Present Invention

The present invention describes methods for enabling the creation of DNA molecules,
genomes and entire artificial living organisms based upon information only, without the
requirement for existing genes, DNA molecules or genomes.

The methods of the present invention are diagrammed in FIG. 1 and FIG. 2 and
generally involve the following steps. Generally, using simple computer software comprising
sets of gene parts and functional elements, it is possible to construct a virtual polynucleotide in
the computer. This polynucleotide consists of a string of DNA bases, G, A, T or C, comprising
for example an entire artificial genome in a linear string. For transfer of the synthetic gene into,
for example, bacterial cells, the polynucleotide should contain the sequence for a bacterial (such
as pBR322) origin of replication. For transfer into eukaryotic cells, it should contain the origin
of replication of a mammalian virus, chromosome or subcellular component such as
mitochondria.

Following construction, simple computer software is then used to break down the
genome sequence into a set of overlapping oligonucleotides of specified length. This results in
a set of shorter DNA sequences which overlap to cover the entire genome in overlapping sets.
Typically a gene of 1000 base pairs would be broken down into 20 100-mers where 10 of
these comprise one strand and 10 of these comprise the other strand. They would be selected to
overlap on each strand by 25 to 50 base pairs.
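
The break-down step can be sketched as follows. This is an illustration under stated assumptions rather than the SynGene program of FIG. 7: the molecule is assumed to be circular (as for a plasmid), its length is assumed to be a multiple of the oligonucleotide length, and the names `tile_circular`, `oligo_len` and `offset` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complementary strand written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def tile_circular(seq, oligo_len=100, offset=50):
    """Split a circular sequence into plus- and minus-strand oligonucleotides.

    Plus-strand oligos are contiguous windows of oligo_len bases. Minus-strand
    oligos are reverse complements of windows shifted by `offset`, so each
    plus-strand oligo overlaps two minus-strand oligos.
    """
    n = len(seq)
    if n % oligo_len != 0:
        raise ValueError("sequence length must be a multiple of oligo_len in this sketch")
    if not 0 < offset < oligo_len:
        raise ValueError("offset must lie strictly between 0 and oligo_len")
    doubled = seq + seq  # lets the last shifted window wrap around the circle
    starts = range(0, n, oligo_len)
    plus = [seq[i:i + oligo_len] for i in starts]
    minus = [reverse_complement(doubled[i + offset:i + offset + oligo_len]) for i in starts]
    return plus, minus
```

With a 1000-base sequence and the defaults, the sketch yields 10 plus-strand and 10 minus-strand 100-mers with 50-base overlaps, matching the worked numbers in the paragraph above.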

This step is followed by direction of chemical synthesis of each of the overlapping set of
oligonucleotides using an array type synthesizer and phosphoramidite chemistry, resulting in an
array of synthesized oligomers. The next step is to balance the concentration of each oligomer
and pool the oligomers so that a single mixture contains equal concentrations of each. The mixed
oligonucleotides are treated with T4 polynucleotide kinase to 5' phosphorylate the
oligonucleotides. The next step is to carry out a "slow" annealing step to co-anneal all of the
oligomers into the sequence of the predicted gene or genome. This is done by heating the
mixture to 80°C, then allowing it to cool slowly to room temperature over several hours. The
mixture of oligonucleotides is then treated with T4 DNA ligase (or alternatively topoisomerase)
to join the oligonucleotides. The oligonucleotides are then transferred into competent host
cells.
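
The concentration-balancing step is simple arithmetic, sketched here as an illustration. The function name, the oligomer labels and the target amount are hypothetical; the only fact relied on is that 1 µM equals 1 pmol per µL, so the volume to pipette is the target amount divided by the stock concentration.

```python
def equimolar_volumes(conc_uM, pmol_each=100.0):
    """Volume in microliters of each stock that delivers pmol_each picomoles.

    conc_uM maps an oligomer label to its measured stock concentration in
    micromolar (1 uM == 1 pmol/uL).
    """
    if any(c <= 0 for c in conc_uM.values()):
        raise ValueError("concentrations must be positive")
    return {name: pmol_each / c for name, c in conc_uM.items()}
```

Pooling the returned volumes gives a mixture in which every oligomer is present in the same molar amount, whatever the starting concentrations were.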
The above technique represents a "combinatorial" assembly strategy where all
oligonucleotides are jointly co-annealed by temperature-based slow annealing. A variation on 
this strategy, which may be more suitable for very long genes or genomes, such as greater than 
5,000 base pairs final length, is as follows. Using simple computer software, comprising sets of 
gene parts and functional elements, a virtual gene or genome is constructed in the computer. 
This gene or genome would consist of a string of DNA bases, G, A, T or C, comprising the 
entire genome in a linear string. For transfer of the synthetic gene into bacterial cells, it should 
contain the sequence for a bacterial (such as pBR322) origin of replication. 

The next step is to carry out a ligation chain reaction, adding a new oligonucleotide
at each step. With this procedure, the first oligonucleotide in the chain is attached to a
solid support (such as an agarose bead). The second is added along with DNA ligase, the
annealing and ligation reaction is carried out, and the beads are washed. The second, overlapping
oligonucleotide from the opposite strand is added, annealed and ligation carried out. The third
oligonucleotide is added and ligation carried out. This procedure is repeated until all
oligonucleotides are added and ligated. This procedure is best carried out for long sequences
using an automated device. The DNA sequence is removed from the solid support, a final
ligation (if circular) is carried out, and the molecule is transferred into host cells.
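
The order in which oligonucleotides reach the support can be sketched as below. This is an illustration, assuming the alternation described for the stepwise method (an oligonucleotide from one strand, then the overlapping one from the opposite strand, and so on); the function name and the labels are hypothetical.

```python
from itertools import chain, zip_longest

def addition_order(plus_oligos, minus_oligos):
    """Interleave the two sets: plus[0], minus[0], plus[1], minus[1], ...

    plus_oligos is listed from the 5' end of the plus strand; minus_oligos is
    listed starting with the oligo that overlaps plus_oligos[0].
    """
    interleaved = chain.from_iterable(zip_longest(plus_oligos, minus_oligos))
    # zip_longest pads the shorter set with None; drop the padding.
    return [oligo for oligo in interleaved if oligo is not None]
```

An automated device would walk this list, running one anneal, ligate and wash cycle per entry after the first.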



Alternatively, it is contemplated that, if the ligation kinetics allow, all the
oligonucleotides may be placed in a mixture and ligation be allowed to proceed. In yet another
embodiment, a series of smaller polynucleotides may be made by ligating 2, 3, 4, 5, 6, or 7
oligonucleotides into one sequence and adding this to another sequence comprising a similar
number of oligonucleotide parts.



The ligase chain reaction ("LCR"), disclosed in EPO No. 320 308, is incorporated
herein by reference in its entirety. In LCR, two complementary probe pairs are prepared, and in
the presence of the target sequence, each pair will bind to opposite complementary strands of
the target such that they abut. In the presence of a ligase, the two probe pairs will link to form a
single unit. By temperature cycling, as in PCR™, bound ligated units dissociate from the target
and then serve as "target sequences" for ligation of excess probe pairs. U.S. Patent 4,883,750
describes a method similar to LCR for binding probe pairs to a target sequence. The following
sections describe these methods in further detail.

C. Nucleic Acids

The present invention discloses the artificial synthesis of genes. In one embodiment of the
present invention, the artificial genes can be transferred into cells to confer a particular function
either as discrete units or as part of artificial chromosomes or genomes. One will generally prefer
to design oligonucleotides having stretches of 15 to 100 nucleotides, 25 to 200 nucleotides or even
longer where desired. Such fragments may be readily prepared by directly synthesizing the
fragment by chemical means as described below.

Accordingly, the nucleotide sequences of the invention may be used for their ability to
selectively form duplex molecules with complementary stretches of genes or RNAs or to provide
primers for amplification [remainder of line illegible in source]
one will desire to employ varying conditions of hybridization to achieve varying degrees of
hybridization selectivity. Typically high selectivity is favored.

For applications requiring high selectivity, one typically will desire to employ relatively
stringent conditions to form the hybrids, e.g., one will select relatively low salt and/or high
temperature conditions [illegible]
of about 50°C to about 70°C. Such high stringency conditions tolerate little, if any, mismatch
between the oligonucleotide and the template or target strand. It generally is appreciated that
conditions can be rendered more stringent [illegible].

For certain applications, for example, by analogy to substitution of nucleotides by
site-directed mutagenesis, it is appreciated that lower stringency conditions may be used. Under these
conditions, hybridization may occur even though the sequences of probe and target strand are not
perfectly complementary, but are mismatched at one or more positions. Conditions may be
rendered less stringent by increasing salt concentration and decreasing temperature. For example,
a medium stringency condition could be provided by about 0.1 to 0.25 M NaCl at temperatures of
about 37°C to about 55°C, while a low stringency condition could be provided by about 0.15 M to
about 0.9 M salt, at temperatures ranging from about 20°C to about 55°C. Thus, hybridization
conditions can be readily manipulated depending on the desired results.

In certain embodiments, it will be advantageous to determine the hybridization of
oligonucleotides by employing a label. A wide variety of appropriate indicator means are
known in the art, including fluorescent, radioactive, enzymatic or other ligands, such as
avidin/biotin, which are capable of being detected. In preferred embodiments, one may desire to
employ a fluorescent label or an enzyme tag such as urease, alkaline phosphatase or peroxidase,
instead of radioactive or other environmentally undesirable reagents. In the case of enzyme tags,
colorimetric indicator substrates are known that can be employed to provide a detection means
visible to the human eye or spectrophotometrically, to identify whether specific hybridization with
complementary oligonucleotide has occurred.

In embodiments involving a solid phase, for example, the first oligonucleotide is adsorbed
or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is
then subjected to hybridization with the complementary oligonucleotides under desired
conditions. The selected conditions will also depend on the particular circumstances based on the
particular criteria required (depending, for example, on the G+C content, type of target nucleic
acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the
hybridized surface to remove non-specifically bound oligonucleotides, the hybridization may be
detected, or even quantified, by means of the label.

For applications in which the nucleic acid segments of the present invention are
incorporated into vectors, such as plasmids, cosmids or viruses, these segments may be combined
with other DNA sequences, such as promoters, polyadenylation signals, restriction enzyme sites,
multiple cloning sites, other coding segments, and the like, such that their overall length may vary
considerably. It is contemplated that a nucleic acid fragment of almost any length may be
employed, with the total length preferably being limited by the ease of preparation and use in the
intended recombinant DNA protocol.




PCT/US98/19312 

WO 99/14318

DNA segments encoding a specific gene may be introduced into [illegible] host cells
and employed for expressing a specific structural or regulatory protein. Alternatively, through the
[illegible]
employed. Upstream regions containing regulatory regions such as promoter regions may be
isolated and subsequently employed for expression of the selected gene.

The nucleic acids employed may encode antisense constructs that hybridize, under
intracellular conditions, to a nucleic acid of interest. The term "antisense construct" is intended
to refer to nucleic acids, preferably oligonucleotides, that are complementary to the base
sequences of a target DNA. Antisense oligonucleotides, when introduced into a target cell,
specifically bind to their target nucleic acid and interfere with transcription, RNA processing,
transport, translation and/or stability. Antisense constructs may be designed to bind to the
promoter and other control regions, exons, introns or even exon-intron boundaries of a gene.

Other sequences with lower degrees of homology also are contemplated. For example,
an antisense construct which has limited regions of high homology, but also contains a
non-homologous region (e.g., a ribozyme) could be designed. These molecules, though having less
than 50% homology, would bind to target sequences under appropriate conditions.

In certain embodiments, one may wish to employ antisense constructs which include
other elements, for example, those which include C-5 propyne pyrimidines. Oligonucleotides
which contain C-5 propyne analogues of uridine and cytidine have been shown to bind RNA
with high affinity and to be potent antisense inhibitors of gene expression (Wagner et al.,
1993).

According to the present invention, DNA segments of a variety of sizes will be
produced. These DNA segments will, by definition, be linear molecules. As such, they
typically will be modified before further use. These modifications include, in one embodiment,
the restriction of the segment to produce one or more "sticky ends" compatible with
complementary ends of other molecules, including those in vectors capable of supporting the
replication of the DNA segment. This manipulation facilitates "cloning" of the segments.

Typically, cloning involves the use of restriction endonucleases, which cleave at
particular sites within DNA strands, to prepare a DNA segment for transfer into a cloning
vehicle. Ligation of the compatible ends (which include blunt ends) using a DNA ligase
completes the reaction. Depending on the situation, the cloning vehicle may comprise a
relatively small portion of DNA, compared to the insert. Alternatively, the cloning vehicle may
be extremely complex and include a variety of features that will affect the replication and
function of the DNA segment. In certain embodiments, a rare cutter site may be introduced
into the end of the polynucleotide sequence.
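
The notion of "compatible" ends can be illustrated with a short check. This is a sketch under stated assumptions rather than the method of the invention: both single-stranded overhangs are written 5' to 3', both are of the same type (both 5' overhangs or both 3' overhangs), and two ends are treated as compatible when one overhang is the reverse complement of the other. Blunt ends are represented by empty overhangs. The function names are hypothetical; the overhangs shown are those left by EcoRI (AATT) and by BamHI and BglII (GATC).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhangs_compatible(overhang_a: str, overhang_b: str) -> bool:
    """True if two overhangs (both written 5'->3') can anneal for ligation.

    Two empty strings model a pair of blunt ends, which are always
    compatible with each other.
    """
    return overhang_a.upper() == reverse_complement(overhang_b)


if __name__ == "__main__":
    print(overhangs_compatible("AATT", "AATT"))  # EcoRI with EcoRI
    print(overhangs_compatible("GATC", "GATC"))  # BamHI with BglII
    print(overhangs_compatible("AATT", "GATC"))  # EcoRI with BamHI
    print(overhangs_compatible("", ""))          # blunt with blunt
```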

Cloning vehicles include plasmids such as the pUC series, Bluescript™ vectors and a
variety of other vehicles with multipurpose cloning sites, selectable markers and origins of
replication. Because of the nature of the present invention, the cloning vehicles may include
such complex molecules as phagemids and cosmids, which hold relatively large pieces of DNA.
In addition, the generation of artificial chromosomes, and even genomes, is contemplated.

Following cloning into a suitable vector, the construct then is transferred into a
compatible host cell. A variety of different gene transfer techniques are described elsewhere in
this document. Culture of the host cells for the intended purpose (amplification, expression,
subcloning) follows.

Throughout this application, the term "expression construct" is meant to include a
particular kind of cloning vehicle containing a nucleic acid coding for a gene product in which
part or all of the nucleic acid encoding sequence is capable of being transcribed. The transcript
may be translated into a protein, but it need not be. Thus, in certain embodiments, expression
includes both transcription of a gene and translation of an RNA into a gene product. In other
embodiments, expression only includes transcription of the nucleic acid, for example, to
generate antisense constructs.

In preferred embodiments, the nucleic acid is under transcriptional control of a
promoter. A "promoter" refers to a DNA sequence recognized by the synthetic machinery of
the cell, or introduced synthetic machinery, required to initiate the specific transcription of a
gene. The phrase "under transcriptional control" means that the promoter is in the correct
location and orientation in relation to the nucleic acid to control RNA polymerase initiation and
expression of the gene.

The term promoter will be used here to refer to a group of transcriptional control
modules that are clustered around the initiation site for RNA polymerase II. Much of the
thinking about how promoters are organized derives from analyses of several viral promoters,
including those for the HSV thymidine kinase (tk) and SV40 early transcription units. These
studies, augmented by more recent work, have shown that promoters are composed of discrete
functional modules, each containing one or more recognition sites for transcriptional activator
or repressor proteins.



At least one module in each promoter functions to position the start site for RNA
synthesis. The best known example of this is the TATA box, but in some promoters lacking a
TATA box, such as the promoter for the mammalian terminal deoxynucleotidyl transferase
gene, a discrete element overlying the start site itself helps to fix the place of initiation.

Additional promoter elements regulate the frequency of transcriptional initiation.
Typically, these are located in the region 30-110 bp upstream of the start site, although a
number of promoters have recently been shown to contain functional elements downstream of
the start site as well. The spacing between promoter elements frequently is flexible, so that
promoter function is preserved when elements are inverted or moved relative to one another. In
the tk promoter, the spacing between promoter elements can be increased to 50 bp apart before
activity begins to decline. Depending on the promoter, it appears that individual elements can
function either cooperatively or independently to activate transcription.

The particular promoter that is employed to control the expression of a nucleic acid is
not believed to be critical, so long as it is capable of expressing the nucleic acid in the targeted
cell. Thus, where a human cell is targeted, it is preferable to position the nucleic acid coding
region adjacent to and under the control of a promoter that is capable of being expressed in a
human cell. Generally speaking, such a promoter might include either a human or viral
promoter. Preferred promoters include those derived from HSV. Another preferred
embodiment is the tetracycline controlled promoter.

In various other embodiments, the human cytomegalovirus (CMV) immediate early
gene promoter, the SV40 early promoter and the Rous sarcoma virus long terminal repeat can
be used to obtain high-level expression of transgenes. The use of other viral or mammalian
cellular or bacterial phage promoters which are well-known in the art to achieve expression of a
transgene is contemplated as well, provided that the levels of expression are sufficient for a
given purpose. It is envisioned that any elements/promoters may be employed in the context of
the present invention. Below is a list of viral promoters, cellular promoters/enhancers and
inducible promoters/enhancers that could be used in combination with the nucleic acid
encoding a gene of interest in an expression construct. Enhancer/promoter elements
contemplated for use with the present invention include but are not limited to Immunoglobulin
Heavy Chain, Immunoglobulin Light Chain, T-Cell Receptor, HLA DQ α and DQ β,
β-Interferon, Interleukin-2, Interleukin-2 Receptor, MHC Class II 5, MHC Class II HLA-DRα,
β-Actin, Muscle Creatine Kinase, Prealbumin (Transthyretin), Elastase I, Metallothionein,
Collagenase, Albumin Gene, α-Fetoprotein, γ-Globin, β-Globin, c-fos, c-HA-ras, Insulin, Neural
Cell Adhesion Molecule (NCAM), α1-Antitrypsin, H2B (TH2B) Histone, Mouse or Type I
Collagen, Glucose-Regulated Proteins (GRP94 and GRP78), Rat Growth Hormone, Human
Serum Amyloid A (SAA), Troponin I (TN I), Platelet-Derived Growth Factor, Duchenne
Muscular Dystrophy, SV40, Human Immunodeficiency Virus, Cytomegalovirus, Gibbon Ape
Leukemia Virus. Inducible promoter elements and their associated inducers are listed in
Table 2 below. This list is not intended to be exhaustive of all the possible elements involved
in the promotion of transgene expression but, merely, to be exemplary thereof. Additionally,
any promoter/enhancer combination (as per the Eukaryotic Promoter Data Base EPDB) could
also be used to drive expression of the gene. Eukaryotic cells can support cytoplasmic
transcription from certain bacterial promoters if the appropriate bacterial polymerase is
provided, either as part of the delivery complex or as an additional genetic expression construct.

Enhancers were originally detected as genetic elements that increased transcription from
a promoter located at a distant position on the same molecule of DNA. This ability to act over
a large distance had little precedent in classic studies of prokaryotic transcriptional regulation.
Subsequent work showed that regions of DNA with enhancer activity are organized much like
promoters. That is, they are composed of many individual elements, each of which binds to one
or more transcriptional proteins.

The basic distinction between enhancers and promoters is operational. An enhancer
region as a whole must be able to stimulate transcription at a distance; this need not be true of a
promoter region or its component elements. On the other hand, a promoter must have one or
more elements that direct initiation of RNA synthesis at a particular site and in a particular
orientation, whereas enhancers lack these specificities. Promoters and enhancers are often
overlapping and contiguous, often seeming to have a very similar modular organization.

Table 2

Element                                   Inducer

MT II                                     Phorbol Ester (TPA), Heavy metals
MMTV (mouse mammary tumor virus)          Glucocorticoids
β-Interferon                              poly(rI)X, poly(rc)
Adenovirus 5 E2                           Ela
c-jun                                     Phorbol Ester (TPA), H2O2
Collagenase                               Phorbol Ester (TPA)
Stromelysin                               Phorbol Ester (TPA), IL-1
SV40                                      Phorbol Ester (TPA)
Murine MX Gene                            Interferon, Newcastle Disease Virus
GRP78 Gene                                A23187
α-2-Macroglobulin                         IL-6
Vimentin                                  Serum
MHC Class I Gene H-2kB                    Interferon
HSP70                                     Ela, SV40 Large T Antigen
Proliferin                                Phorbol Ester-TPA
Tumor Necrosis Factor                     PMA
Thyroid Stimulating Hormone α Gene        Thyroid Hormone

Use of the baculovirus system will involve high level expression from the powerful
polyhedrin promoter.

One will typically include a polyadenylation signal to effect proper polyadenylation of 
the transcript. The nature of the polyadenylation signal is not believed to be crucial to the 
successful practice of the invention, and any such sequence may be employed. Preferred 
embodiments include the SV40 polyadenylation signal and the bovine growth hormone
polyadenylation signal, convenient and known to function well in various target cells. Also
contemplated as an element of the expression cassette is a terminator. These elements can serve
to enhance message levels and to minimize read through from the cassette into other sequences.

A specific initiation signal also may be required for efficient translation of coding
sequences. These signals include the ATG initiation codon and adjacent sequences.
Exogenous translational control signals, including the ATG initiation codon, may need to be
provided. One of ordinary skill in the art would readily be capable of determining this and
providing the necessary signals. It is well known that the initiation codon must be in-frame
with the reading frame of the desired coding sequence to ensure translation of the entire insert.
The exogenous translational control signals and initiation codons can be either natural or
synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate
transcription enhancer elements (Bittner et al., 1987).
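
The in-frame requirement for an exogenous initiation codon can be expressed as a simple arithmetic check. The sketch below is illustrative only and assumes 0-based positions within a single DNA strand; the function name and the example construct are hypothetical. An ATG is in-frame with a coding sequence when the distance between them is a multiple of three.

```python
def atg_in_frame(sequence: str, atg_index: int, coding_start: int) -> bool:
    """Check that an ATG at atg_index is in-frame with the codon at coding_start.

    Both positions are 0-based indices into `sequence`; coding_start is the
    first base of a codon of the desired reading frame.
    """
    seq = sequence.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if coding_start < atg_index:
        raise ValueError("coding_start must not precede the ATG")
    return (coding_start - atg_index) % 3 == 0


if __name__ == "__main__":
    leader = "GCCACC"                # hypothetical 5' leader
    insert = "GCTAAACCCGGGTAA"       # hypothetical coding insert
    in_frame = leader + "ATG" + "GGA" + insert   # 3-nt linker keeps the frame
    shifted = leader + "ATG" + "GGAC" + insert   # 4-nt linker shifts the frame
    print(atg_in_frame(in_frame, 6, 12))   # insert starts at index 12
    print(atg_in_frame(shifted, 6, 13))    # insert starts at index 13
```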

In certain embodiments, it may be desirable to include specialized regions known as
telomeres at the end of a genome sequence. Telomeres are repeated sequences found at
chromosome ends and it has long been known that chromosomes with truncated ends are
unstable, tend to fuse with other chromosomes and are otherwise lost during cell division.
Some data suggest that telomeres interact with the nucleoprotein complex and the nuclear matrix.
One putative role for telomeres includes stabilizing chromosomes and shielding the ends from
degradative enzymes.

Another possible role for telomeres is in replication. According to present doctrine,
replication of DNA requires starts from short RNA primers annealed to the 3'-end of the
template. The result of this mechanism is an "end replication problem" in which the region
corresponding to the RNA primer is not replicated. Over many cell divisions, this will result in
the progressive truncation of the chromosome. It is thought that telomeres may provide a
buffer against this effect, at least until they are themselves eliminated by this effect. A further
structure to be included in DNA segments is a centromere.



In certain embodiments of the invention, the delivery of a nucleic acid in a cell may be
identified in vitro or in vivo by including a marker in the expression construct. The marker
would result in an identifiable change to the transfected cell permitting easy identification of
expression.



A number of selection systems may be used, including, but not limited to, the herpes
simplex virus thymidine kinase (Wigler et al., 1977), hypoxanthine-guanine
phosphoribosyltransferase (Szybalska et al., 1962) and adenine phosphoribosyltransferase
genes (Lowy et al., 1980), in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite
resistance can be used as the basis of selection for dhfr, which confers resistance to
methotrexate (Wigler et al., 1980; O'Hare et al., 1981); gpt, which confers resistance to
mycophenolic acid (Mulligan et al., 1981); neo, which confers resistance to the aminoglycoside
G-418 (Colberre-Garapin et al., 1981); and hygro, which confers resistance to hygromycin.

Usually the inclusion of a drug selection marker aids in cloning and in the selection of
transformants, for example, neomycin, puromycin, hygromycin, DHFR, GPT, zeocin and
histidinol. Alternatively, enzymes such as herpes simplex virus thymidine kinase (tk)
(eukaryotic) or chloramphenicol acetyltransferase (CAT) (prokaryotic) may be employed.
Immunologic markers also can be employed. The selectable marker employed is not believed
to be important, so long as it is capable of being expressed simultaneously with the nucleic acid
encoding a gene product. Further examples of selectable markers are well known to one of skill
in the art.

In certain embodiments of the invention, internal ribosome entry site (IRES) elements
are used to create multigene, or polycistronic, messages. IRES elements are
able to bypass the ribosome scanning model of 5' methylated Cap dependent translation and
begin translation at internal sites (Pelletier and Sonenberg, 1988). IRES elements from two
members of the picornavirus family (polio and encephalomyocarditis) have been described
(Pelletier and Sonenberg, 1988), as well as an IRES from a mammalian message (Macejak and
Sarnow, 1991). IRES elements can be linked to heterologous open reading frames. Multiple
open reading frames can be transcribed together, each separated by an IRES, creating
polycistronic messages. By virtue of the IRES element, each open reading frame is accessible
to ribosomes for efficient translation. Multiple genes can be efficiently expressed using a single
promoter/enhancer to transcribe a single message.

Any heterologous open reading frame can be linked to IRES elements. This includes
genes for secreted proteins, multi-subunit proteins encoded by independent genes, intracellular
or membrane-bound proteins and selectable markers. In this way, expression of several
proteins can be simultaneously engineered into a cell with a single construct and a single
selectable marker.

D. Encoded Proteins

In this application, the inventors use genetic information for creative or synthetic
purposes. The complete genome sequence will give a catalog of all genes necessary for the
survival, reproduction, evolution and speciation of an organism and, given suitable high tech
tools, genomic information may be modified or even created from "scratch" in order to
synthesize life. Thus it is contemplated that a combination of suitable energy generation genes,
regulatory genes, and other functional genes could be constructed which would be sufficient to
render an artificial organism with the basic functionalities to enable independent survival.

To meet this goal, the present invention utilizes known cDNA sequences for any given
gene to express proteins in an artificial organism. Any protein so expressed in this invention may
be modified for particular purposes according to methods well known to those of skill in the art.
For example, particular peptide residues may be derivatized or chemically modified in order to
alter the immune response or to permit coupling of the peptide to other agents. It also is possible
to change particular amino acids within the peptides without disturbing the overall structure or
antigenicity of the peptide. Such changes are therefore termed "conservative" changes and tend to
rely on the hydrophilicity or polarity of the residue. The size and/or charge of the side chains also
are relevant factors in determining which substitutions are conservative.

Once the entire coding sequence of a gene has been determined, the gene can be inserted
into an appropriate expression system. The gene can be expressed in any number of different
recombinant DNA expression systems to generate large amounts of the polypeptide product,
which can then be purified and used to vaccinate animals to generate antisera with which further
studies may be conducted.



Examples of expression systems known to the skilled practitioner in the art include
bacteria such as E. coli, yeast such as Saccharomyces cerevisiae and Pichia pastoris, baculovirus,
and mammalian expression systems such as in COS or CHO cells. In one embodiment,
polypeptides are expressed in E. coli and in baculovirus expression systems. A complete gene can
be expressed or, alternatively, fragments of the gene encoding portions of polypeptide can be
produced.

In one embodiment, the gene sequence encoding the polypeptide is analyzed to detect 
putative transmembrane sequences. Such sequences are typically very hydrophobic and are 
readily detected by the use of standard sequence analysis software, such as MacVector (IBI, New 
Haven, CT). The presence of transmembrane sequences is often deleterious when a recombinant 
protein is synthesized in many expression systems, especially E. coli, as it leads to the production 
of insoluble aggregates that are difficult to renature into the native conformation of the protein. 
Deletion of transmembrane sequences typically does not significantly alter the conformation of 
the remaining protein structure. 
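
Detection of putative transmembrane sequences by hydrophobicity can be sketched as a sliding-window average over a hydropathy scale. The sketch below is not the algorithm of any particular software package: it uses the Kyte and Doolittle (1982) hydropathic index values quoted later in this document, and the 19-residue window and 1.6 cutoff are commonly used conventions assumed here for illustration. The function names and the example protein are hypothetical.

```python
# Kyte & Doolittle (1982) hydropathic index, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def window_averages(protein: str, window: int = 19) -> list:
    """Average hydropathy of every full window along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def putative_tm_windows(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list:
    """Start indices (0-based) of windows whose average meets the cutoff."""
    return [
        i for i, avg in enumerate(window_averages(protein, window))
        if avg >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical protein: a hydrophobic 19-mer flanked by charged residues.
    protein = "MKKDE" + "LLIVALLVAILLGVALIVL" + "RKDEN"
    print(putative_tm_windows(protein))
```

Windows flagged in this way mark candidate segments for deletion before expression; the cutoff trades sensitivity against false positives and would need tuning for a given protein family.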

Moreover, transmembrane sequences, being by definition embedded within a membrane, 
are inaccessible. Therefore, antibodies to these sequences will not prove useful for in vivo or in 
situ studies. Deletion of transmembrane-encoding sequences from the genes used for expression 
can be achieved by standard techniques. For example, fortuitously-placed restriction enzyme sites
can be used to excise the desired gene fragment, or PCR™-type amplification can be used to
amplify only the desired part of the gene. The skilled practitioner will realize that such changes 
must be designed so as not to change the translational reading frame for downstream portions of 
the protein-encoding sequence. 
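
The reading-frame constraint on such deletions reduces to requiring that the number of deleted nucleotides be a multiple of three. A minimal sketch follows, assuming 0-based, end-exclusive coordinates within the coding sequence; the function name and example sequence are hypothetical.

```python
def delete_in_frame(coding_seq: str, start: int, end: int) -> str:
    """Remove coding_seq[start:end] while keeping the downstream reading frame.

    Raises ValueError if the deletion length is not a multiple of three,
    since such a deletion would shift the frame of everything downstream.
    """
    if not 0 <= start <= end <= len(coding_seq):
        raise ValueError("deletion coordinates out of range")
    if (end - start) % 3 != 0:
        raise ValueError("deletion length must be a multiple of 3")
    return coding_seq[:start] + coding_seq[end:]


if __name__ == "__main__":
    gene = "ATGGCTCTGCTGAAATAA"  # hypothetical: ATG GCT CTG CTG AAA TAA
    print(delete_in_frame(gene, 3, 12))  # removes GCT CTG CTG
```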



In one embodiment, computer sequence analysis is used to determine the location of the 
predicted major antigenic determinant epitopes of the polypeptide. Software capable of carrying 
out this analysis is readily available commercially, for example MacVector (IBI, New Haven, 
CT). The software typically uses standard algorithms such as the Kyte/Doolittle or Hopp/Woods 
methods for locating hydrophilic sequences which are characteristically found on the surface of
proteins and are, therefore, likely to act as antigenic determinants. 

Once this analysis is made, polypeptides can be prepared that contain at least the essential
features of the antigenic determinant and that can be employed in the generation of antisera
against the polypeptide. Minigenes or gene fusions encoding these determinants can be
constructed and inserted into expression vectors by standard methods, for example, using PCR™
methodology.

The gene or gene fragment encoding a polypeptide can be inserted into an expression
vector by standard subcloning techniques. In one embodiment, an E. coli expression vector is
used that produces the recombinant polypeptide as a fusion protein, allowing rapid affinity
purification of the protein. Examples of such fusion protein expression systems are the
glutathione S-transferase system (Pharmacia, Piscataway, NJ), the maltose binding protein system
(NEB, Beverly, MA), the FLAG system (IBI, New Haven, CT), and the 6xHis system (Qiagen,
Chatsworth, CA).

Some of these systems produce recombinant polypeptides bearing only a small number of
additional amino acids, which are unlikely to affect the antigenic ability of the recombinant
polypeptide. For example, both the FLAG system and the 6xHis system add only short
sequences, both of which are known to be poorly antigenic and which do not adversely affect
folding of the polypeptide to its native conformation. Other fusion systems produce polypeptides
where it is desirable to excise the fusion partner from the desired polypeptide. In one
embodiment, the fusion partner is linked to the recombinant polypeptide by a peptide sequence
containing a specific recognition sequence for a protease. Examples of suitable sequences are
those recognized by the Tobacco Etch Virus protease (Life Technologies, Gaithersburg, MD) or
Factor Xa (New England Biolabs, Beverly, MA).

Recombinant bacterial cells, for example E. coli, are grown in any of a number of suitable
media, for example LB, and the expression of the recombinant polypeptide induced by adding
IPTG to the media or switching incubation to a higher temperature. After culturing the bacteria
for a further period of between 2 and 24 h, the cells are collected by centrifugation and washed to
remove residual media. The bacterial cells are then lysed, for example, by disruption in a cell
homogenizer and centrifuged to separate the dense inclusion bodies and cell membranes from the
soluble cell components. This centrifugation can be performed under conditions whereby the
dense inclusion bodies are selectively enriched by incorporation of sugars such as sucrose into the
buffer and centrifugation at a selective speed.



In another embodiment, the expression system used is one driven by the baculovirus
polyhedrin promoter. The gene encoding the polypeptide can be manipulated by standard
techniques in order to facilitate cloning into the baculovirus vector. One baculovirus vector is the
pBlueBac vector (Invitrogen, Sorrento, CA). The vector carrying the gene for the polypeptide is
transfected into Spodoptera frugiperda (Sf9) cells by standard protocols, and the cells are cultured
and processed to produce the recombinant antigen. See Summers et al., A MANUAL OF
METHODS FOR BACULOVIRUS VECTORS AND INSECT CELL CULTURE
PROCEDURES, Texas Agricultural Experimental Station.



In designing a gene that encodes a particular polypeptide, the hydropathic index of amino
acids may be considered. Table 3 provides a codon table showing the codons that encode a
particular amino acid. The importance of the hydropathic amino acid index in conferring
interactive biologic function on a protein is generally understood in the art (Kyte & Doolittle,
1982). The following is a brief discussion of the hydropathic amino acid index for use in the
present invention.

Table 3

Amino Acids                      Codons

Alanine          Ala    A        GCA GCC GCG GCU
Cysteine         Cys    C        UGC UGU
Aspartic acid    Asp    D        GAC GAU
Glutamic acid    Glu    E        GAA GAG
Phenylalanine    Phe    F        UUC UUU
Glycine          Gly    G        GGA GGC GGG GGU
Histidine        His    H        CAC CAU
Isoleucine       Ile    I        AUA AUC AUU
Lysine           Lys    K        AAA AAG
Leucine          Leu    L        UUA UUG CUA CUC CUG CUU
Methionine       Met    M        AUG
Asparagine       Asn    N        AAC AAU
Proline          Pro    P        CCA CCC CCG CCU
Glutamine        Gln    Q        CAA CAG
Arginine         Arg    R        AGA AGG CGA CGC CGG CGU
Serine           Ser    S        AGC AGU UCA UCC UCG UCU
Threonine        Thr    T        ACA ACC ACG ACU
Valine           Val    V        GUA GUC GUG GUU
Tryptophan       Trp    W        UGG
Tyrosine         Tyr    Y        UAC UAU
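
The codon table lends itself to a direct lookup structure. The following sketch, offered as an illustration rather than as part of the disclosed method, encodes Table 3 as a dictionary, inverts it for translation, and counts how many distinct coding sequences could encode a given peptide. The three stop codons (UAA, UAG, UGA) are not listed in Table 3 and are added here as an assumption of the standard genetic code; function names are hypothetical.

```python
# Table 3: one-letter amino acid code -> RNA codons.
CODONS = {
    "A": "GCA GCC GCG GCU", "C": "UGC UGU", "D": "GAC GAU", "E": "GAA GAG",
    "F": "UUC UUU", "G": "GGA GGC GGG GGU", "H": "CAC CAU",
    "I": "AUA AUC AUU", "K": "AAA AAG", "L": "UUA UUG CUA CUC CUG CUU",
    "M": "AUG", "N": "AAC AAU", "P": "CCA CCC CCG CCU", "Q": "CAA CAG",
    "R": "AGA AGG CGA CGC CGG CGU", "S": "AGC AGU UCA UCC UCG UCU",
    "T": "ACA ACC ACG ACU", "V": "GUA GUC GUG GUU", "W": "UGG",
    "Y": "UAC UAU",
}
CODON_TO_AA = {
    codon: aa for aa, codons in CODONS.items() for codon in codons.split()
}
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code, not in Table 3


def translate(rna: str) -> str:
    """Translate an RNA (or DNA) sequence from its first base to the first stop."""
    rna = rna.upper().replace("T", "U")
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(CODON_TO_AA[codon])
    return "".join(peptide)


def count_encodings(peptide: str) -> int:
    """Number of distinct codon sequences that encode the peptide."""
    total = 1
    for aa in peptide.upper():
        total *= len(CODONS[aa].split())
    return total


if __name__ == "__main__":
    print(translate("AUGGCUUGGUAA"))
    print(count_encodings("AL"))
```

The degeneracy counted by `count_encodings` is what leaves room to choose among synonymous codons when designing a synthetic gene.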



It is accepted that the relative hydropathic character of the amino acid contributes to the
secondary structure of the resultant protein, which in turn defines the interaction of the protein
with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens,
and the like.

Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics (Kyte & Doolittle, 1982); these are: isoleucine
(+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine
(+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine
(-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5);
asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other amino acids
having a similar hydropathic index or score and still result in a protein with similar biological
activity, i.e., still obtain a biological functionally equivalent protein. In making such changes,
the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those
which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly
preferred.
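
The preference tiers above can be applied mechanically to a proposed substitution. The sketch below is illustrative only: it uses the Kyte and Doolittle (1982) index values and reports which of the ±0.5, ±1 and ±2 tiers a substitution falls into. The function name and the returned labels are hypothetical conveniences.

```python
# Kyte & Doolittle (1982) hydropathic index, one-letter amino acid codes.
HYDROPATHIC_INDEX = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_tier(original: str, replacement: str):
    """Classify a substitution by the difference in hydropathic index.

    Returns a label for the tightest tier that applies, or None when the
    indices differ by more than 2.
    """
    diff = abs(HYDROPATHIC_INDEX[original] - HYDROPATHIC_INDEX[replacement])
    diff = round(diff, 1)  # index values carry one decimal place
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None


if __name__ == "__main__":
    for pair in [("I", "V"), ("K", "R"), ("L", "A"), ("I", "R")]:
        print(pair, substitution_tier(*pair))
```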

It is also understood in the art that the substitution of like amino acids can be made 
effectively on the basis of hydrophilicity. U.S. Patent 4,554,101, incorporated herein by
reference, states that the greatest local average hydrophilicity of a protein, as governed by the 
hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein. 

As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been
assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate
(+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine
(-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
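
The "greatest local average hydrophilicity" can be located with a sliding window over these values. This is a minimal sketch under stated assumptions: the central value is used for residues listed with a ±1 range, and the six-residue window is a conventional choice assumed here rather than taken from the text. The function name and example protein are hypothetical.

```python
# Hydrophilicity values as listed above (central value where a range is given).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(protein: str, window: int = 6):
    """Return (start index, average) of the window with the highest average.

    The start index is 0-based; ties resolve to the first such window.
    """
    scores = [HYDROPHILICITY[aa] for aa in protein.upper()]
    if len(scores) < window:
        raise ValueError("protein shorter than window")
    best_start = max(
        range(len(scores) - window + 1),
        key=lambda i: sum(scores[i:i + window]),
    )
    best_avg = sum(scores[best_start:best_start + window]) / window
    return best_start, best_avg


if __name__ == "__main__":
    # Hypothetical protein with a charged stretch between hydrophobic runs.
    protein = "LLVFAI" + "KDERKD" + "GLLVAW"
    start, avg = most_hydrophilic_window(protein)
    print(start, protein[start:start + 6], avg)
```

The window found this way is a candidate antigenic determinant in the sense discussed above; it is a prediction to be confirmed experimentally.
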






It is understood that an amino acid can be substituted for another having a similar 
hydrophilicity value and still obtain a biologically equivalent and immunologically equivalent 
protein. In such changes, the substitution of amino acids whose hydrophilicity values are within 
±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are 
even more particularly preferred. 



As outlined above, amino acid substitutions are generally based on the relative 
similarity of the amino acid side-chain substituents, for example, their hydrophobicity, 
hydrophilicity, charge, size, and the like. Exemplary substitutions that take various of the 
foregoing characteristics into consideration are well known to those of skill in the art and 
include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and
asparagine; and valine, leucine and isoleucine. 

E. Expression of and Delivery of Genes 
I. Expression 

Once the designer gene, genome or biological system has been made according to the
methods described herein, the polynucleotides can be expressed as encoded peptides or proteins
of the gene, genome or biological system. The engineering of the polynucleotides for
expression in a prokaryotic or eukaryotic system may be performed by techniques generally
known to those of skill in recombinant expression. Therefore, promoters and other elements
specific to a bacterial, mammalian or other system may be included in the polynucleotide
sequence. It is believed that virtually any expression system may be employed in the
expression of the claimed nucleic acid sequences.

The artificially generated polynucleotide sequences are suitable for eukaryotic 
expression, as the host cell will generally process the genomic transcripts to yield functional 
mRNA for translation into protein. It is believed that the use of a designer gene version will 
provide advantages in that the size of the gene will generally be much smaller and more readily 
employed to transfect the targeted cell than will a genomic gene, which will typically be up to 
an order of magnitude larger than the designer gene. However, the inventor does not exclude 
the possibility of employing a genomic version of a particular gene where desired. 






As used herein, the terms "engineered" and "recombinant" cells are intended to refer to a 
cell into which an exogenous polynucleotide described herein has been introduced. Therefore, 
engineered cells are distinguishable from naturally-occurring cells which do not contain a 
recombinant introduced exogenous polynucleotide. Engineered cells are thus cells having a 
gene or genes introduced through the hand of man. Recombinant cells include those having an
introduced polynucleotide, and also include those having polynucleotides positioned adjacent
to a promoter not naturally associated with the particular introduced gene.



To express a recombinant encoded protein or peptide, whether mutant or wild-type, in
accordance with the present invention one would prepare an expression vector that comprises
one of the claimed isolated nucleic acids under the control of one or more promoters. To bring
a coding sequence "under the control of" a promoter, one positions the 5' end of the
translational initiation site of the reading frame generally between about 1 and 50 nucleotides
"downstream" of (i.e., 3' of) the chosen promoter. The "upstream" promoter stimulates
transcription of the inserted DNA and promotes expression of the encoded recombinant protein.
This is the meaning of "recombinant expression" in the context used here.
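
The positioning rule can be checked numerically once promoter and initiation-site coordinates are known. The sketch below reflects one interpretation, stated here as an assumption: the distance is counted from the last base of the promoter to the first base of the translational initiation site, using 0-based indices on the coding strand. The function name and the example sequences are hypothetical.

```python
def initiation_site_positioned(promoter_last: int, initiation_start: int,
                               min_nt: int = 1, max_nt: int = 50) -> bool:
    """True if the initiation site begins min_nt..max_nt bases 3' of the promoter.

    promoter_last is the index of the final promoter base; initiation_start
    is the index of the first base of the translational initiation site.
    """
    distance = initiation_start - promoter_last
    return min_nt <= distance <= max_nt


if __name__ == "__main__":
    promoter = "TATAAAGGCC"   # hypothetical promoter fragment
    spacer = "CACCGGTCC"      # hypothetical 9-nt spacer
    construct = promoter + spacer + "ATGGCTTAA"
    promoter_last = len(promoter) - 1
    initiation_start = len(promoter) + len(spacer)
    print(construct[initiation_start:initiation_start + 3])
    print(initiation_site_positioned(promoter_last, initiation_start))
```
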
Many standard techniques are available to construct expression vectors containing the
appropriate nucleic acids and transcriptional/translational control sequences in order to achieve
protein or peptide expression in a variety of host-expression systems. Cell types available for
expression include, but are not limited to, bacteria, such as E. coli and B. subtilis transformed
with recombinant phage DNA, plasmid DNA or cosmid DNA expression vectors.

Certain examples of prokaryotic hosts are E. coli strain RR1, E. coli LE392, E. coli B,
E. coli χ1776 (ATCC No. 31537) as well as E. coli W3110 (F-, lambda-, prototrophic, ATCC
No. 273325); bacilli such as Bacillus subtilis; and other enterobacteriaceae such as Salmonella
typhimurium, Serratia marcescens, and various Pseudomonas species.

In general, plasmid vectors containing replicon and control sequences that are derived 
from species compatible with the host cell are used in connection with these hosts. The vector 
ordinarily carries a replication site, as well as marking sequences that are capable of providing 
phenotypic selection in transformed cells. For example, E. coli is often transformed using 
pBR322, a plasmid derived from an E. coli species. Plasmid pBR322 contains genes for 
ampicillin and tetracycline resistance and thus provides easy means for identifying transformed 
cells. The pBR322 plasmid, or other microbial plasmid or phage must also contain, or be 
modified to contain, promoters that can be used by the microbial organism for expression of its 
own proteins. 

In addition, phage vectors containing replicon and control sequences that are compatible 
with the host microorganism can be used as transforming vectors in connection with these 
hosts. For example, the phage lambda GEM™-11 may be utilized in making a recombinant 
phage vector that can be used to transform host cells, such as E. coli LE392. 

Further useful vectors include pIN vectors (Inouye et al, 1985); and pGEX vectors, for 
use in generating glutathione S-transferase (GST) soluble fusion proteins for later purification 
and separation or cleavage. Other suitable fusion proteins are those with β-galactosidase, 
ubiquitin, or the like. 
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Promoters that are most commonly used in recombinant DNA construction include the 
β-lactamase (penicillinase), lactose and tryptophan (trp) promoter systems. While these are the 
most commonly used, other microbial promoters have been discovered and utilized, and details 
concerning their nucleotide sequences have been published, enabling those of skill in the art to 
ligate them functionally with plasmid vectors. 

For expression in Saccharomyces, the plasmid YRp7, for example, is commonly used 
(Stinchcomb et al., 1979; Kingsman et al., 1979; Tschumper et al., 1980). This plasmid 
contains the trp1 gene, which provides a selection marker for a mutant strain of yeast lacking 
the ability to grow in the absence of tryptophan, for example ATCC No. 44076 or PEP4-1 
(Jones, 1977). The presence of the trp1 lesion as a characteristic of the yeast host cell genome 
then provides an effective environment for detecting transformation by growth in the absence 
of tryptophan. 

Suitable promoting sequences in yeast vectors include the promoters for 
3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 
1968; Holland et al., 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, 
hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose 
isomerase, and glucokinase. In constructing suitable expression plasmids, the termination 
sequences associated with these genes are also ligated into the expression vector 3' of the 
sequence desired to be expressed to provide polyadenylation of the mRNA and termination. 

Other suitable promoters, which have the additional advantage of transcription 
controlled by growth conditions, include the promoter region for alcohol dehydrogenase 2, 
isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, 
and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible 
for maltose and galactose utilization. 

In addition to micro-organisms, cultures of cells derived from multicellular organisms 
may also be used as hosts. In principle, any such cell culture is workable, whether from 
vertebrate or invertebrate culture. In addition to mammalian cells, these include insect cell 
systems infected with recombinant virus expression vectors (e.g., baculovirus); and plant cell 
systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression 
vectors (e.g., Ti plasmid) containing one or more coding sequences. 

In a useful insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is 
used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The 
isolated nucleic acid coding sequences are cloned into non-essential regions (for example, the 
polyhedrin gene) of the virus and placed under control of an AcNPV promoter (for example, 
the polyhedrin promoter). Successful insertion of the coding sequences results in the 
inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., 
virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant 
viruses are then used to infect Spodoptera frugiperda cells in which the inserted gene is 
expressed (e.g., U.S. Patent No. 4,215,051). 
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Examples of useful mammalian host cell lines are VERO and HeLa cells, Chinese 
hamster ovary (CHO) cell lines, WI38, BHK, COS-7, 293, HepG2, NIH3T3, RIN and MDCK 
cell lines. In addition, a host cell may be chosen that modulates the expression of the inserted 
sequences, or modifies and processes the gene product in the specific fashion desired. Such 
modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be 
important for the function of the encoded protein. 

Different host cells have characteristic and specific mechanisms for the post- 
translational processing and modification of proteins. Appropriate cell lines or host systems 
can be chosen to ensure the correct modification and processing of the foreign protein 
expressed. Expression vectors for use in mammalian cells ordinarily include an origin of 
replication (as necessary), a promoter located in front of the gene to be expressed, along with 
any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and 
transcriptional terminator sequences. The origin of replication may be provided either by 
construction of the vector to include an exogenous origin, such as may be derived from SV40 or 
other viral (e.g., Polyoma, Adeno, VSV, BPV) source, or may be provided by the host cell 
chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, 
the latter is often sufficient. 
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The promoters may be derived from the genome of mammalian cells (e.g., 
metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the 
vaccinia virus 7.5K promoter). Further, it is also possible, and may be desirable, to utilize 
promoter or control sequences normally associated with the desired gene sequence, provided 
such control sequences are compatible with the host cell systems. 

Specific initiation signals may also be required for efficient translation of the claimed 
isolated nucleic acid coding sequences. These signals include the ATG initiation codon and 
adjacent sequences. Exogenous translational control signals, including the ATG initiation 
codon, may additionally need to be provided. One of ordinary skill in the art would readily be 
capable of determining this need and providing the necessary signals. It is well known that the 
initiation codon must be in-frame (or in-phase) with the reading frame of the desired coding 
sequence to ensure translation of the entire insert. These exogenous translational control 
signals and initiation codons can be of a variety of origins, both natural and synthetic. The 
efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer 
elements or transcription terminators (Bittner et al., 1987). 
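
For illustration, the in-frame requirement for an exogenous initiation codon may be sketched as follows. The function name and the example sequence are hypothetical; the additional check for an intervening stop codon is an assumption of this sketch, added because such a codon would also prevent translation of the entire insert.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def initiation_codon_in_frame(seq, atg_pos, coding_start):
    """Return True if the ATG at atg_pos is in-frame with the coding sequence.

    seq:          DNA sequence of the sense strand.
    atg_pos:      0-based index of the A of the candidate ATG.
    coding_start: 0-based index of the first codon of the desired coding sequence.
    """
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        return False
    # In-frame means the two positions are separated by a whole number of codons.
    if atg_pos > coding_start or (coding_start - atg_pos) % 3 != 0:
        return False
    # Assumed extra check: no stop codon between the ATG and the coding sequence.
    for i in range(atg_pos, coding_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


# Hypothetical construct: ATG at index 2, coding sequence starting at index 5.
print(initiation_codon_in_frame("GGATGGCTAAAGAATTC", 2, 5))
```
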

In eukaryotic expression, one will also typically desire to incorporate into the 
transcriptional unit an appropriate polyadenylation site (e.g., 5'-AATAAA-3') if one was not 
contained within the original cloned segment. Typically, the poly A addition site is placed 
about 30 to 2000 nucleotides "downstream" of the termination site of the protein at a position 
prior to transcription termination. 
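
As a non-limiting sketch, the placement guideline for the polyadenylation site can be checked by locating each 5'-AATAAA-3' signal and measuring its distance from the end of the stop codon. The function name and the example construct are assumptions of this sketch.

```python
def find_polya_sites(seq, stop_end, signal="AATAAA", min_dist=30, max_dist=2000):
    """Return distances of every poly A signal lying min_dist..max_dist nt
    downstream of stop_end.

    seq:      DNA sequence of the sense strand.
    stop_end: 0-based index of the first nucleotide after the stop codon.
    """
    seq = seq.upper()
    hits = []
    pos = seq.find(signal, stop_end)
    while pos != -1:
        dist = pos - stop_end
        if min_dist <= dist <= max_dist:
            hits.append(dist)
        pos = seq.find(signal, pos + 1)  # continue past this occurrence
    return hits


# Hypothetical construct: a 9 nt reading frame, a 40 nt spacer, then the signal.
construct = "ATGGCTTAA" + "C" * 40 + "AATAAA" + "C" * 10
print(find_polya_sites(construct, stop_end=9))
```

An empty list indicates that no signal falls within the stated window, in which case a polyadenylation site would need to be added to the transcriptional unit.
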

For long-term, high-yield production of recombinant proteins, stable expression is 
preferred. For example, cell lines that stably express constructs encoding proteins may be 
engineered. Rather than using expression vectors that contain viral origins of replication, host 
cells can be transformed with vectors controlled by appropriate expression control elements 
(e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and 
a selectable marker. Following the introduction of foreign DNA, engineered cells may be 
allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective 
medium. The selectable marker in the recombinant plasmid confers resistance to the selection 
and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci, 
which in turn can be cloned and expanded into cell lines. 

It is contemplated that the nucleic acids of the invention may be "overexpressed", i.e., 
expressed in increased levels relative to their natural expression in human cells, or even relative 
to the expression of other proteins in the recombinant host cell. Such overexpression may be 
assessed by a variety of methods, including radio-labeling and/or protein purification. 
However, simple and direct methods are preferred, for example, those involving SDS/PAGE 
and protein staining or western blotting, followed by quantitative analyses, such as 
densitometry scanning of the resultant gel or blot. A specific increase in the level of the 
recombinant protein or peptide in comparison to the level in natural human cells is indicative of 
overexpression, as is a relative abundance of the specific protein in relation to the other proteins 
produced by the host cell and, e.g., visible on a gel. 
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II. Delivery 

In various embodiments of the invention, the expression construct may comprise a virus 
or engineered construct derived from a viral genome. The ability of certain viruses to enter 
cells via receptor-mediated endocytosis and to integrate into the host cell genome and express 
viral genes stably and efficiently has made them attractive candidates for the transfer of 
foreign genes into mammalian cells (Ridgeway, 1988; Nicolas and Rubenstein, 1988; Baichwal 
and Sugden, 1986; Temin, 1986). The first viruses used as vectors were DNA viruses including 
the papovaviruses (simian virus 40, bovine papilloma virus, and polyoma) (Ridgeway, 1988; 
Baichwal and Sugden, 1986) and adenoviruses (Ridgeway, 1988; Baichwal and Sugden, 1986) 
and adeno-associated viruses. Retroviruses also are attractive gene transfer vehicles (Nicolas 
and Rubenstein, 1988; Temin, 1986) as are vaccinia virus (Ridgeway, 1988) and 
adeno-associated virus (Ridgeway, 1988). Such vectors may be used to (i) transform cell lines 
in vitro for the purpose of expressing proteins of interest or (ii) to transform cells in vitro or 
in vivo to provide therapeutic polypeptides in a gene therapy scenario. Herpes simplex virus 
(HSV) is another attractive candidate, especially where neurotropism is desired. HSV also is 
relatively easy to manipulate and can be grown to high titers. Thus, delivery is less of a 
problem, both in terms of volumes needed to attain sufficient MOI and in a lessened need for 
repeat dosings. 

With the recent recognition of defective hepatitis B viruses, new insight was gained into 
the structure-function relationship of different viral sequences. In vitro studies showed that the 
virus could retain the ability for helper-dependent packaging and reverse transcription despite 
the deletion of up to 80% of its genome (Horwich et al., 1990). This suggested that large 
portions of the genome could be replaced with foreign genetic material. The hepatotropism and 
persistence (integration) were particularly attractive properties for liver-directed gene transfer. 
Chang et al. recently introduced the chloramphenicol acetyltransferase (CAT) gene into duck 
hepatitis B virus genome in the place of the polymerase, surface, and pre-surface coding 
sequences. It was co-transfected with wild-type virus into an avian hepatoma cell line. Culture 
media containing high titers of the recombinant virus were used to infect primary duckling 
hepatocytes. Stable CAT gene expression was detected for at least 24 days after transfection 
(Chang et al., 1991). 

Several non-viral methods for the transfer of expression constructs into cultured 
mammalian cells also are contemplated by the present invention. These include calcium 
phosphate precipitation (Graham and Van Der Eb, 1973; Chen and Okayama, 1987; Rippe et 
al., 1990), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986; Potter et al., 
1984), direct microinjection (Harland and Weintraub, 1985), DNA-loaded liposomes (Nicolau 
and Sene, 1982; Fraley et al., 1979) and lipofectamine-DNA complexes, cell sonication 
(Fechheimer et al., 1987), gene bombardment using high velocity microprojectiles (Yang et al., 
1990), and receptor-mediated transfection (Wu and Wu, 1987; Wu and Wu, 1988). Some of 
these techniques may be successfully adapted for in vivo or ex vivo use. 

Once the expression construct has been delivered into the cell, the nucleic acid encoding 
the gene of interest may be positioned and expressed at different sites. In certain embodiments, 
the nucleic acid encoding the gene may be stably integrated into the genome of the cell. This 
integration may be in the cognate location and orientation via homologous recombination (gene 
replacement) or it may be integrated in a random, non-specific location (gene augmentation). 
In yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate, 
episomal segment of DNA. Such nucleic acid segments or "episomes" encode sequences 
sufficient to permit maintenance and replication independent of or in synchronization with the 
host cell cycle. How the expression construct is delivered to a cell and where in the cell the 
nucleic acid remains is dependent on the type of expression construct employed. 

In one embodiment, the expression construct may simply consist of naked recombinant 
DNA or plasmids. Transfer of the construct may be performed by any of the methods 
mentioned above which physically or chemically permeabilize the cell membrane. This is 
particularly applicable for transfer in vitro but it may be applied to in vivo use as well. 
Dubensky et al. (1984) successfully injected polyomavirus DNA in the form of calcium 
phosphate precipitates into liver and spleen of adult and newborn mice demonstrating active 
viral replication and acute infection. Benvenisty and Reshef (1986) also demonstrated that 
direct intraperitoneal injection of calcium phosphate-precipitated plasmids results in expression 
of the transfected genes. It is envisioned that DNA encoding a gene of interest may also be 
transferred in a similar manner in vivo and express the gene product. 

Another embodiment of the invention for transferring a naked DNA expression 
construct or DNA segment into cells may involve particle bombardment. This method depends 
on the ability to accelerate DNA-coated microprojectiles to a high velocity allowing them to 
pierce cell membranes and enter cells without killing them (Klein et al, 1987). Several devices 
for accelerating small particles have been developed. One such device relies on a high voltage 
discharge to generate an electrical current, which in turn provides the motive force (Yang et al, 
1990). The microprojectiles used have consisted of biologically inert substances such as 
tungsten or gold beads. 



Selected organs including the liver, skin, and muscle tissue of rats and mice have been 
bombarded in vivo (Yang et al., 1990; Zelenin et al., 1991). This may require surgical exposure 
of the tissue or cells, to eliminate any intervening tissue between the gun and the target organ, 
i.e., ex vivo treatment. Again, DNA encoding a particular gene may be delivered via this 
method and still be incorporated by the present invention. 

In a further embodiment of the invention, the DNA segment or expression construct 
may be entrapped in a liposome. Liposomes are vesicular structures characterized by a 
phospholipid bilayer membrane and an inner aqueous medium. Multilamellar liposomes have 
multiple lipid layers separated by aqueous medium. They form spontaneously when 
phospholipids are suspended in an excess of aqueous solution. The lipid components undergo 
self-rearrangement before the formation of closed structures and entrap water and dissolved 
solutes between the lipid bilayers (Ghosh and Bachhawat, 1991). Also contemplated are 
lipofectamine-DNA complexes. 






Liposome-mediated nucleic acid delivery and expression of DNA in vitro has been very 
successful. Wong et al. (1980) demonstrated the feasibility of liposome-mediated delivery and 
expression of foreign DNA in cultured chick embryo, HeLa and hepatoma cells. Nicolau et al. 
(1987) accomplished successful liposome-mediated gene transfer in rats after intravenous 
injection. 

In certain embodiments, the liposome may be complexed with a hemagglutinating virus 
(HVJ). This has been shown to facilitate fusion with the cell membrane and promote cell entry 
of liposome-encapsulated DNA (Kaneda et al., 1989). In other embodiments, the liposome 
may be complexed or employed in conjunction with nuclear non-histone chromosomal proteins 
(HMG-1) (Kato et al., 1991). In yet further embodiments, the liposome may be complexed or 
employed in conjunction with both HVJ and HMG-1. Because such expression constructs have 
been successfully employed in transfer and expression of nucleic acid in vitro and in vivo, 
they are applicable for the present invention. Where a bacterial promoter is employed in the 
DNA construct, it also will be desirable to include within the liposome an appropriate bacterial 
polymerase. 

Other expression constructs which can be employed to deliver a nucleic acid encoding a 
particular gene into cells are receptor-mediated delivery vehicles. These take advantage of the 
selective uptake of macromolecules by receptor-mediated endocytosis in almost all eukaryotic 
cells. Because of the cell type-specific distribution of various receptors, the delivery can be 
highly specific (Wu and Wu, 1993). 

Receptor-mediated gene targeting vehicles generally consist of two components: a cell 
receptor-specific ligand and a DNA-binding agent. Several ligands have been used for 
receptor-mediated gene transfer. The most extensively characterized ligands are 
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asialoorosomucoid (ASOR) (Wu and Wu, 1987) and transferrin (Wagner et al, 1990). 
Recently, a synthetic neoglycoprotein, which recognizes the same receptor as ASOR, has been 
used as a gene delivery vehicle (Ferkol et al, 1993; Perales et al, 1994) and epidermal growth 
factor (EGF) has also been used to deliver genes to squamous carcinoma cells (Myers, EPO 
0273085). 



In other embodiments, the delivery vehicle may comprise a ligand and a liposome. For 
example, Nicolau et al. (1987) employed lactosyl-ceramide, a galactose-terminal 
asialoganglioside, incorporated into liposomes and observed an increase in the uptake of the 
insulin gene by hepatocytes. Thus, it is feasible that a nucleic acid encoding a particular gene 
also may be specifically delivered into a cell type such as lung, epithelial or tumor cells, by any 
number of receptor-ligand systems with or without liposomes. 

In certain embodiments, gene transfer may more easily be performed under ex vivo 
conditions. Ex vivo gene therapy refers to the isolation of cells from an organism, the delivery 
of a nucleic acid into the cells in vitro, and then the return of the modified cells back into an 
organism. This may involve the surgical removal of tissue/organs from an animal or the 
primary culture of cells and tissues. Anderson et al., U.S. Patent 5,399,346, and incorporated 
herein in its entirety, disclose ex vivo therapeutic methods. 

F. Oligonucleotide Synthesis 

Oligonucleotide synthesis is well known to those of skill in the art. Various different 
mechanisms of oligonucleotide synthesis have been disclosed in, for example, U.S. Patents 
4,659,774, 4,816,571, 5,141,813, 5,264,566, 4,959,463, 5,428,148, 5,554,744, 5,574,146, and 
5,602,244, each of which is incorporated herein by reference. 


Phosphoramidite chemistry (Beaucage and Iyer, 1992) has become by far the most 
widely used coupling chemistry for the synthesis of oligonucleotides. As is well known to those 
skilled in the art, phosphoramidite synthesis of oligonucleotides involves activation of 
nucleoside phosphoramidite monomer precursors by reaction with an activating agent to form 
activated intermediates, followed by sequential addition of the activated intermediates to the 
growing oligonucleotide chain (generally anchored at one end to a suitable solid support) to 
form the oligonucleotide product. 

Tetrazole is commonly used for the activation of the nucleoside phosphoramidite 
monomers. Tetrazole has an acidic proton which presumably protonates the basic nitrogen of 
the diisopropylamino phosphine group, thus making the diisopropylamino group a leaving 
group. The negatively charged tetrazolide ion then attacks the trivalent 
phosphorus, forming a transient phosphorus tetrazolide species. The 5'-OH group of the 
solid support bound nucleoside then attacks the active trivalent phosphorus species, resulting 
in the formation of the internucleotide linkage. The trivalent phosphorus is finally oxidized to 
the pentavalent phosphorus. The U.S. patents listed above describe other activators and solid 
supports for oligonucleotide synthesis. 
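
Because each nucleotide is added in a separate coupling cycle, the fraction of full-length product depends on the per-cycle coupling efficiency. The sketch below uses the commonly cited approximation that an oligonucleotide of n nucleotides requires n - 1 couplings, so the full-length fraction is roughly the coupling efficiency raised to the power n - 1. This approximation, the function name, and the example values are assumptions introduced here for illustration and do not describe the performance of any particular instrument.

```python
def full_length_fraction(length, coupling_efficiency):
    """Estimate the fraction of chains that reach full length.

    length:              number of nucleotides in the oligonucleotide.
    coupling_efficiency: per-cycle coupling efficiency, between 0 and 1.
    Assumes the first nucleoside is pre-attached to the solid support,
    so only (length - 1) coupling cycles are needed.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)


# Hypothetical comparison of two per-cycle efficiencies for a 60-mer.
for efficiency in (0.99, 0.995):
    print(efficiency, round(full_length_fraction(60, efficiency), 3))
```
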

High throughput oligonucleotide synthesis can be achieved using a synthesizer. The 
Genome Science and Technology Center, as one aspect of the automation development effort, 
recently developed a high throughput large scale oligonucleotide synthesizer. This instrument, 
denoted the MERMADE, is based on a 96-well plate format and uses robotic control to carry 
out parallel synthesis on 192 samples (two 96-well plates). This device has been variously 
described in the literature and in presentations, and is generally available in the public domain 
(licensed from the University of Texas and available on contract from Avantec). The device 
has gone through various generations with differing operating parameters. 
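
To illustrate the two-plate format, the following sketch maps a sample number to a plate and well. The row-major A1 to H12 labeling and the function name are assumptions of this sketch; the actual addressing used by the instrument is not described here.

```python
ROWS = "ABCDEFGH"      # 8 rows of a standard 96-well plate
COLS = 12              # 12 columns per row


def well_for_sample(index, n_plates=2):
    """Map a 0-based sample index to (plate number, well label).

    Assumes samples fill each plate row by row (A1..A12, B1..B12, ...)
    before moving to the next plate.
    """
    wells_per_plate = len(ROWS) * COLS
    if not 0 <= index < wells_per_plate * n_plates:
        raise ValueError("sample index out of range")
    plate, offset = divmod(index, wells_per_plate)
    row, col = divmod(offset, COLS)
    return plate + 1, f"{ROWS[row]}{col + 1}"


# The first and last of the 192 parallel syntheses.
print(well_for_sample(0), well_for_sample(191))
```
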

The device may be used to synthesize 192 oligonucleotides simultaneously with 99% 
success. It has virtually 100% success for oligomers less than 60 bp, operates at 20 mM 
synthesis levels, and gives a product yield of >99% complete synthesis. Using these systems 
the inventor has synthesized over 10,000 oligomers used for sequencing, PCR™ amplification 
and recombinant DNA applications. For most uses, including cloning, synthesis success is 
sufficient such that post synthesis purification is not required. 

Once the genome has been synthesized using the methods of the present invention it 
may be necessary to screen the sequences for analysis of function. Specifically contemplated 
by the present inventor are chip-based DNA technologies such as those described by Hacia et 
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al. (1996) and Shoemaker et al. (1996). Briefly, these techniques involve quantitative methods 
for analyzing large numbers of genes rapidly and accurately. By tagging genes with 
oligonucleotides or using fixed probe arrays, one can employ chip technology to segregate 
target molecules as high density arrays and screen these molecules on the basis of 
hybridization. See also Pease et al. (1994); Fodor et al. (1991). 
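
The principle of screening by hybridization can be illustrated with a highly simplified sketch in which a probe is scored as hybridizing when its exact reverse complement occurs in the target. Real arrays tolerate mismatches and depend on hybridization conditions; the exact-match rule, the function names, and the example sequences are assumptions made for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridizing_probes(target, probes):
    """Return the names of probes whose reverse complement occurs in target.

    target: DNA sequence of the molecule being screened.
    probes: dict mapping probe name to probe sequence.
    """
    target = target.upper()
    return [name for name, seq in probes.items()
            if reverse_complement(seq) in target]


# Hypothetical two-probe array screened against a short target.
print(hybridizing_probes("GGGCATGG", {"p1": "ATGC", "p2": "TTTT"}))
```
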



The use of combinatorial synthesis and high throughput screening assays is well 
known to those of skill in the art, e.g., U.S. Patents 5,807,754; 5,807,683; 5,804,563; 5,789,162; 
5,783,384; 5,770,358; 5,759,779; 5,747,334; 5,686,242; 5,198,346; 5,738,996; 5,733,743; 
5,714,320; 5,663,046 (each specifically incorporated herein by reference). These patents teach 
various aspects of the methods and compositions involved in the assembly and activity analyses 
of high density arrays of different polysubunits (polynucleotides or polypeptides). As such it is 
contemplated that the methods and compositions described in the patents listed above may be 
useful in assaying the activity profiles of the compositions of the present invention. 



The present invention produces a replication competent polynucleotide. Viruses are 
naturally occurring replication competent pieces of DNA. To the extent that disclosure regarding 
viruses may be useful in the context of the present invention, the following is a disclosure of 
viruses. Researchers note that viruses have evolved to be able to deliver their DNA to various 
host tissues despite the human body's various defensive mechanisms. For this reason, 
numerous viral vectors have been designed by researchers seeking to create vehicles for 
therapeutic gene delivery. Some of the types of viruses that have been engineered are listed 
below. 



I. Adenovirus 

Adenovirus is a 36 kb, linear, double-stranded DNA virus that allows substitution of 
large pieces of adenoviral DNA with foreign sequences up to 7 kb (Grunhaus and Horwitz, 
1992). Adenovirus DNA does not integrate into the host cell chromosome because adenoviral 
DNA can replicate in an episomal manner. Also, adenoviruses are structurally stable, and no 
genome rearrangement has been detected after extensive amplification. Adenovirus can infect 
virtually all epithelial cells regardless of their cell cycle stage. This means that adenovirus can 
infect non-dividing cells. So far, adenoviral infection appears to be linked only to mild disease 
such as acute respiratory disease in humans. This group of viruses can be obtained in high 
titers, e.g., 10^9-10^11 plaque-forming units per ml, and they are highly infective. 

Both ends of the viral genome contain 100-200 base pair inverted repeats (ITRs), which 
are cis elements necessary for viral DNA replication and packaging. The early (E) and late (L) 
regions of the genome contain different transcription units that are divided by the onset of viral 
DNA replication. The E1 region (E1A and E1B) encodes proteins responsible for the 
regulation of transcription of the viral genome and a few cellular genes. The expression of the 
E2 region (E2A and E2B) results in the synthesis of the proteins for viral DNA replication. 
These proteins are involved in DNA replication, late gene expression and host cell shut-off 
(Renan, 1990). The products of the late genes, including the majority of the viral capsid 
proteins, are expressed only after significant processing of a single primary transcript issued by 
the major late promoter (MLP). The MLP (located at 16.8 m.u.) is particularly efficient during 
the late phase of infection, and all the mRNAs issued from this promoter possess a 5'-tripartite 
leader (TPL) sequence which makes them preferred mRNAs for translation. 


The E3 region encodes proteins that appear to be necessary for efficient lysis of Ad 
infected cells as well as preventing TNF-mediated cytolysis and CTL mediated lysis of infected 
cells. In general, the E4 region is believed to encode seven proteins, some of which 
activate the E2 promoter. It has been shown to block host mRNA transport and enhance 
transport of viral RNA to cytoplasm. Further, the E4 product is in part responsible for the 
decrease in early gene expression seen late in infection. E4 also inhibits E1A and E4 (but not 
E1B) expression during lytic growth. Some E4 proteins are necessary for efficient DNA 
replication; however, the mechanism for this involvement is unknown. E4 is also involved in 
post-transcriptional events in viral late gene expression, i.e., alternative splicing of the tripartite 
leader in lytic growth. Nevertheless, E4 functions are not absolutely required for DNA 
replication but their lack will delay replication. Other functions include negative regulation of 
viral DNA synthesis, induction of sub-nuclear reorganization normally seen during adenovirus 
infection, and other functions that are necessary for viral replication, late viral mRNA 
accumulation, and host cell transcriptional shut off. 
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II. Retroviruses 

The retroviruses are a group of single-stranded RNA viruses characterized by an ability 
to convert their RNA to double-stranded DNA in infected cells by a process of 
reverse-transcription (Coffin, 1990). The resulting DNA then stably integrates into cellular 
chromosomes as a provirus and directs synthesis of viral proteins. The integration results in the 
retention of the viral gene sequences in the recipient cell and its descendants. The retroviral 
genome contains three genes, gag, pol, and env, that code for capsid proteins, polymerase 
enzyme, and envelope components, respectively. A sequence found upstream from the gag 
gene, termed Ψ, functions as a signal for packaging of the genome into virions. To produce 
virions, a packaging cell line containing the gag, pol, and env genes but lacking the LTR and 
Ψ components is constructed (Mann et al., 1983). When a recombinant plasmid 
containing a human cDNA, together with the retroviral LTR and Ψ sequences, is introduced into 
this cell line (by calcium phosphate precipitation, for example), the Ψ sequence allows the RNA 
transcript of the recombinant plasmid to be packaged into viral particles, which are then 
secreted into the culture media (Nicolas and Rubenstein, 1988; Temin, 1986; Mann et al., 
1983). The media containing the recombinant retroviruses is then collected, optionally 
concentrated, and used for gene transfer. Retroviral vectors are able to infect a broad variety of 
cell types. However, integration requires the division of host cells (Paskind et al., 1975). 

The retrovirus family includes the subfamilies of the oncoviruses, the lentiviruses and 
the spumaviruses. Two oncoviruses are Moloney murine leukemia virus (MMLV) and feline 
leukemia virus (FeLV). The lentiviruses include human immunodeficiency virus (HIV), simian 
immunodeficiency virus (SIV) and feline immunodeficiency virus (FIV). Among the murine 
viruses such as MMLV there is a further classification. Murine viruses may be ecotropic, 
xenotropic, polytropic or amphotropic. Each class of viruses targets different cell surface 
receptors in order to initiate infection. 

Further advances in retroviral vector design and concentration methods have allowed 
production of amphotropic and xenotropic viruses with titers of 10^8 to 10^9 cfu/ml (Bowles et 
al., 1996; Irwin et al., 1994; Jolly, 1994; Kitten et al., 1997). 



Replication defective recombinant retroviruses are not acute pathogens in primates 
(Chowdhury et al., 1991). They have been successfully applied in cell culture systems to 
transfer the CFTR gene and generate cAMP-activated Cl- secretion in a variety of cell types 
including human airway epithelia (Drumm et al., 1990; Olsen et al., 1992; Anderson et al., 
1991; Olsen et al., 1993). While there is evidence of immune responses to the viral gag and 
env proteins, this does not prevent successful readministration of vector (McCormack et al., 
1997). Further, since recombinant retroviruses have no expressed gene products other than the 
transgene, the risk of a host inflammatory response due to viral protein expression is limited 
(McCormack et al., 1997). As for the concern about insertional mutagenesis, to date there are 
no examples of insertional mutagenesis arising from any human trial with recombinant 
retroviral vectors. 

More recently, hybrid lentivirus vectors have been described combining elements of 
human immunodeficiency virus (HIV) (Naldini et al, 1996) or feline immunodeficiency virus 
(FIV) (Poeschla et al, 1998) and MMLV. These vectors transduce nondividing cells in the 
CNS (Naldini et al, 1996; Blomer et al, 1997), liver (Kafri et al, 1997), muscle (Kafri et al, 
1997) and retina (Miyoshi et al, 1997). However, a recent report in xenograft models of 
human airway epithelia suggests that in well-differentiated epithelia, gene transfer with VSV-G 
pseudotyped HIV-based lentivirus is inefficient (Goldman et al, 1997). 

III. Adeno-Associated Virus 

In addition, AAV possesses several unique features that make it more desirable than the 
other vectors: Unlike retroviruses, AAV can infect non-dividing cells; wild-type AAV has been 
characterized by integration, in a site-specific manner, into chromosome 19 of human cells 
(Kotin and Berns, 1989; Kotin et al, 1990; Kotin et al, 1991; Samulski et al, 1991); and AAV 
also possesses anti-oncogenic properties (Ostrove et al, 1981; Berns and Giraud, 1996). 
Recombinant AAV genomes are constructed by molecularly cloning DNA sequences of interest 
between the AAV ITRs, eliminating the entire coding sequences of the wild-type AAV 
genome. The AAV vectors thus produced lack any of the coding sequences of wild-type AAV, 
yet retain the property of stable chromosomal integration and expression of the recombinant 
genes upon transduction both in vitro and in vivo (Berns, 1990; Berns and Bohensky, 1987; 
Bertran et al, 1996; Kearns et al, 1996; Ponnazhagan et al, 1997a). Until recently, AAV was 
believed to infect almost all cell types, and even cross species barriers. However, it now has 
been determined that AAV infection is receptor-mediated (Ponnazhagan et al, 1996; Mizukami 
et al, 1996). 



AAV utilizes a linear, single-stranded DNA of about 4700 base pairs. Inverted terminal 
repeats flank the genome. Two genes are present within the genome, giving rise to a number of 
distinct gene products. The first, the cap gene, produces three different virion proteins (VP), 
designated VP-1, VP-2 and VP-3. The second, the rep gene, encodes four non-structural 
proteins (NS). One or more of these rep gene products is responsible for transactivating AAV 
transcription. The sequence of AAV is provided by Srivastava et al. (1983), and in U.S. Patent 
5,252,479 (entire text of which is specifically incorporated herein by reference). 

The three promoters in AAV are designated by their location, in map units, in the 
genome. These are, from left to right, p5, p19 and p40. Transcription gives rise to six 
transcripts, two initiated at each of three promoters, with one of each pair being spliced. The 
splice site, derived from map units 42-46, is the same for each transcript. The four non- 
structural proteins apparently are derived from the longer of the transcripts, and three virion 
proteins all arise from the smallest transcript. 



AAV is not associated with any pathologic state in humans. Interestingly, for efficient 
replication, AAV requires "helping" functions from viruses such as herpes simplex virus I and 
II, cytomegalovirus, pseudorabies virus and, of course, adenovirus. The best characterized of 
the helpers is adenovirus, and many "early" functions for this virus have been shown to assist 
with AAV replication. Low level expression of AAV rep proteins is believed to hold AAV 
structural expression in check, and helper virus infection is thought to remove this block. 

IV. Vaccinia Virus 

Vaccinia viruses are a genus of the poxvirus family. Vaccinia virus vectors have been 
used extensively because of the ease of their construction, relatively high levels of expression 
obtained, wide host range and large capacity for carrying DNA. Vaccinia contains a linear, 
double-stranded DNA genome of about 186 kB that exhibits a marked "A-T" preference. 
Inverted terminal repeats of about 10.5 kB flank the genome. The majority of essential genes 
appear to map within the central region, which is most highly conserved among poxviruses. 
Estimated open reading frames in vaccinia virus number from 150 to 200. Although both 
strands are coding, extensive overlap of reading frames is not common. U.S. Patent 5,656,465 
(specifically incorporated by reference) describes in vivo gene delivery using pox viruses. 

V. Papovavirus 

The papovavirus family includes the papillomaviruses and the polyomaviruses. The 
polyomaviruses include Simian Virus 40 (SV40), polyoma virus and the human 
polyomaviruses BKV and JCV. Papillomaviruses include the bovine and human 
papillomaviruses. The genomes of polyomaviruses are circular DNAs of a little more than 
5000 bases. The predominant gene products are three virion proteins (VP1-3) and Large T and 
Small T antigens. Some have an additional structural protein, the agnoprotein, and others have 
a Middle T antigen. Papillomaviruses are somewhat larger, approaching 8 kB. 

Little is known about the cellular receptors for polyomaviruses, but polyoma infection 
can be blocked by treating with sialidase. SV40 will still infect sialidase-treated cells, but JCV 
cannot hemagglutinate cells treated with sialidase. Because interaction of polyoma VP1 with 
the cell surface activates c-myc and c-fos, it has been hypothesized that the virus receptor may 
have some properties of a growth factor receptor. Papillomaviruses are specifically tropic for 
squamous epithelia, though the specific receptor has not been identified. 

VI. Paramyxovirus 

The paramyxovirus family is divided into three genera: paramyxovirus, morbillivirus 
and pneumovirus. The paramyxovirus genus includes the mumps virus and Sendai virus, 
among others, while the morbilliviruses include the measles virus and the pneumoviruses 
include respiratory syncytial virus (RSV). Paramyxovirus genomes are RNA based and contain 
a set of six or more genes, covalently linked in tandem. The genome is somewhat over 15 kB 
in length. The viral particle is 150-250 nm in diameter, with "fuzzy" projections or spikes 
protruding therefrom. These are viral glycoproteins that help mediate attachment and entry of 
the virus into host cells. 



A specialized series of proteins is involved in the binding and entry of paramyxoviruses. 
Attachment in Paramyxoviruses and Morbilliviruses is mediated by glycoproteins that bind to 
sialic acid-containing receptors. Other proteins anchor the virus by embedding hydrophobic 
regions in the lipid bilayer of the cell's surface, and exhibit hemagglutinating and neuraminidase 
activities. In Pneumoviruses, the glycoprotein is heavily glycosylated with O-glycosidic bonds. 
This molecule lacks the hemagglutinating and neuraminidase activities of its relatives. 



VII. Herpesvirus 

Because herpes simplex virus (HSV) is neurotropic, it has generated considerable 
interest in treating nervous system disorders. Moreover, the ability of HSV to establish latent 
infections in non-dividing neuronal cells without integrating into the host cell chromosome or 
otherwise altering the host cell's metabolism, along with the existence of a promoter that is 
active during latency, makes HSV an attractive vector. And though much attention has focused 
on the neurotropic applications of HSV, this vector also can be exploited for other tissues given 
its wide host range. 

Another factor that makes HSV an attractive vector is the size and organization of the 
genome. Because HSV is large, incorporation of multiple genes or expression cassettes is less 
problematic than in other smaller viral systems. In addition, the availability of different viral 
control sequences with varying performance (temporal, strength, etc.) makes it possible to 
control expression to a greater extent than in other systems. It also is an advantage that the 
virus has relatively few spliced messages, further easing genetic manipulations. 

HSV also is relatively easy to manipulate and can be grown to high titers. Thus, 
delivery is less of a problem, both in terms of volumes needed to attain sufficient MOI and in a 
lessened need for repeat dosings. For a review of HSV as a gene therapy vector, see Glorioso et 
al. (1995). 

HSV, designated with subtypes 1 and 2, are enveloped viruses that are among the most 
common infectious agents encountered by humans, infecting millions of human subjects 
worldwide. The large, complex, double-stranded DNA genome encodes dozens of different 
gene products, some of which derive from spliced transcripts. In addition to virion and 
envelope structural components, the virus encodes numerous other proteins including a 
protease, a ribonucleotide reductase, a DNA polymerase, a ssDNA binding protein, a 
helicase/primase, a DNA dependent ATPase, a dUTPase and others. 

HSV genes form several groups whose expression is coordinately regulated and 
sequentially ordered in a cascade fashion (Honess and Roizman, 1974; Honess and Roizman, 
1975; Roizman and Sears, 1995). The expression of α genes, the first set of genes to be 
expressed after infection, is enhanced by the virion protein number 16, or α-transinducing factor 
(Post et al, 1981; Batterson and Roizman, 1983; Campbell et al, 1983). The expression of β 
genes requires functional α gene products, most notably ICP4, which is encoded by the α4 gene 
(DeLuca et al, 1985). γ genes, a heterogeneous group of genes encoding largely virion 
structural proteins, require the onset of viral DNA synthesis for optimal expression (Holland et 
al, 1980). 

In line with the complexity of the genome, the life cycle of HSV is quite involved. In 
addition to the lytic cycle, which results in synthesis of virus particles and, eventually, cell 
death, the virus has the capability to enter a latent state in which the genome is maintained in 
neural ganglia until some as yet undefined signal triggers a recurrence of the lytic cycle. 
Avirulent variants of HSV have been developed and are readily available for use in gene 
therapy contexts (U.S. Patent 5,672,344). 



G. Examples 

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed in 
the examples which follow represent techniques discovered by the inventor to function well in 
the practice of the invention, and thus can be considered to constitute preferred modes for its 
practice. However, those of skill in the art should, in light of the present disclosure, appreciate 
that many changes can be made in the specific embodiments which are disclosed and still 
obtain a like or similar result without departing from the spirit and scope of the invention. 

EXAMPLE 1 
Combinatoric gene assembly 

The inventor has developed a strategy of oligomer assembly into larger DNA molecules 
denoted combinatoric assembly. The procedure is carried out as follows: one may design a 
plasmid using one of a number of commercial or public domain computer programs to contain 
the genes, promoters, drug selection, origin of replication, etc. required. SynGene v.2.0 is a 
program that generates a list of overlapping oligonucleotides sufficient to reassemble the gene 
or plasmid (see FIG. 7A-FIG. 7G). For instance, for a 5000 bp gene, SynGene 2.0 can generate 
two lists, 100 component 50-mers from one strand and 100 component 50-mers from the 
complementary strand, such that each pair of oligomers will overlap by 25 base pairs. The 
program checks the sequence for repeats and produces a MERMADE input file which directly 
programs the oligonucleotide synthesizer. The synthesizer produces two sets of 96-well plates 
containing the complementary oligonucleotides. A SynGene program is depicted in FIG. 7. 
This program is designed to break down a designer gene or genome into oligonucleotides for 
synthesis. The program is for the complete synthetic designer gene and is based upon an 
original program for formatting DNA sequences written by Dr. Glen Evans. 
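
The decomposition described above may be illustrated by the following minimal sketch. It is not the SynGene 2.0 program of FIG. 7; the function names, the assumption of a circular template whose length is a multiple of the oligomer length, and the omission of repeat checking and MERMADE file output are all illustrative assumptions. The sketch tiles the plus strand into 50-mers and produces minus-strand 50-mers staggered by 25 bases, so that each minus-strand oligomer overlaps two plus-strand neighbors by 25 bases each.

```python
# Illustrative sketch only; names and the circular-template assumption are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def split_into_oligos(seq, length=50, offset=25):
    """Tile a circular sequence into plus-strand and staggered minus-strand oligomers.

    Plus-strand oligomers cover seq[0:length], seq[length:2*length], and so on.
    Minus-strand oligomers cover the same windows shifted by `offset` bases and are
    written 5'->3' on the complementary strand. The last minus-strand window wraps
    around the end of the sequence, which assumes a circular molecule such as a plasmid.
    """
    if len(seq) % length != 0:
        raise ValueError("sequence length must be a multiple of the oligomer length")
    doubled = seq + seq  # lets the final shifted window wrap around
    plus = [seq[i:i + length] for i in range(0, len(seq), length)]
    minus = [
        reverse_complement(doubled[i + offset:i + offset + length])
        for i in range(0, len(seq), length)
    ]
    return plus, minus
```

For a 5000-base input with the default settings, the sketch returns 100 plus-strand and 100 minus-strand 50-mers, matching the counts given in the text.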

Combinatoric assembly is best carried out using a programmable robotic workstation 
such as a Beckman Biomek 2000. In short, pairs of oligomers which overlap are mixed and 
annealed. Following annealing, a smaller set of duplex oligomers is generated. These are again 
paired and annealed, forming a smaller set of larger oligomers. Sequentially, overlapping 
oligomers are allowed to anneal until the entire reassembly is completed. Annealing may be 
carried out in the absence of ligase, or each step may be followed by ligation. In one 
configuration, oligomers are annealed in the presence of topoisomerase 2, which does not 
require 5' phosphorylation of the oligomer, occurs at room temperature, and is a rapid (5 
minute) reaction as opposed to 12 h ligation at 12°. Following the complete assembly, the 
resulting DNA molecule can be used for its designed purpose, usually transformation into a 
bacterial host for replication. The steps in this cycle are outlined in FIG. 3. 
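
The pairwise pattern of combinatoric assembly may be sketched as follows. The sketch only plans which neighboring fragments are joined in each round; it does not model annealing chemistry or robot commands, and the function name and the rule that an unpaired fragment is carried forward to the next round are assumptions made for illustration.

```python
# Illustrative sketch only; the scheduling rule for odd fragment counts is an assumption.
def assembly_schedule(n_oligos):
    """Plan pairwise annealing rounds for fragments numbered 0..n_oligos-1.

    Each fragment is tracked as (first, last), the range of original oligomer
    positions it spans. In every round neighbors are joined two at a time; an
    unpaired fragment at the end of a round is carried forward unchanged.
    Returns a list of rounds, each listing the fragments present after that round.
    """
    if n_oligos < 1:
        raise ValueError("need at least one oligomer")
    current = [(i, i) for i in range(n_oligos)]
    rounds = []
    while len(current) > 1:
        joined = []
        for i in range(0, len(current), 2):
            pair = current[i:i + 2]  # one or two neighboring fragments
            joined.append((pair[0][0], pair[-1][1]))
        current = joined
        rounds.append(current)
    return rounds
```

Under these assumptions, 100 duplex oligomers are reduced to a single product in seven rounds (50, 25, 13, 7, 4, 2 and 1 fragments).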

This approach has a major advantage over traditional recombinant DNA based cloning. 
While it is technically feasible to make virtually any modification or mutation in existing DNA 
molecules, the effort required, as well as the high technical skill, make some constructions 
difficult or tedious. This method, while having been used for many years, is not applicable to 
automated gene cloning or large scale creation of entirely novel DNA sequences. 

PCT/US98/19312 

WO 99/14318 

EXAMPLE 2 
Production of Artificial Genes 

In one example, the present invention will produce a known gene of about 1000 base 
pairs in length by the following method. A set of oligonucleotides, each of 50 bases, is 
generated such that the entire plus strand of the gene is represented. A second set of 
oligonucleotides, also composed of 50-mers, is generated for the minus strand. This set is 
designed, however, such that complementary pairing of the first and second sets results in 
overlap of "paired" sequences, i.e., each oligonucleotide of the first set is complementary with 
regions from two oligonucleotides of the second set (with the possible exception of the terminal 
oligonucleotides). The region of overlap is set at 30 bases, leaving a 20 base pair overhang for 
each pair. The first and second sets of oligonucleotides are annealed in a single mixture and 
treated with a ligating enzyme. 
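
The overlap arithmetic of this first example may be checked with the following sketch, which represents each oligonucleotide only by the interval of the gene it covers. The function names are hypothetical, and the treatment of the gene as linear, with shortened minus-strand oligonucleotides at the two termini, is an assumption; the text states only that the terminal oligonucleotides may be exceptions.

```python
# Illustrative sketch only; interval model and end handling are assumptions.
def overlap(a, b):
    """Length of the intersection of two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def tile(total, length, shift):
    """Return plus-strand windows and minus-strand windows shifted by `shift` bases.

    Windows are clipped to the ends of a linear sequence of `total` bases, so the
    two terminal minus-strand windows are shorter than `length`.
    """
    if not 0 < shift < length:
        raise ValueError("shift must lie strictly between 0 and the oligomer length")
    plus = [(s, min(total, s + length)) for s in range(0, total, length)]
    minus = [
        (max(0, s), min(total, s + length))
        for s in range(shift - length, total, length)
    ]
    return plus, minus


def partner_overlaps(plus, minus):
    """For each minus-strand window, list its overlaps with plus-strand windows."""
    return [[overlap(m, p) for p in plus if overlap(m, p) > 0] for m in minus]
```

With a 1000 base gene, 50-mers and a 20 base shift, each internal minus-strand oligonucleotide pairs with two plus-strand oligonucleotides, over 30 bases with one and 20 bases with the other, while each terminal oligonucleotide pairs with only one.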

In another example, the gene to be synthesized is about 5000 base pairs. Each set of 
oligonucleotides is made up of fifty 100-mers with overlapping regions, of complementary 
oligonucleotides, of 75 bases, leaving 25 base "sticky ends." In this embodiment, the 5' 
terminal oligonucleotide of the first oligonucleotide set is annealed with the 3' terminal 
oligonucleotide of the second set to form a first annealed product, then the next most 5' 
terminal oligonucleotide of the first set is annealed with the first annealed product to form a 
second annealed product, and the process is repeated until all oligonucleotides of said first and 
said second sets have been annealed. Ligation of the products may occur between steps or at 
the conclusion of all hybridizations. 

In a third example, a gene of 100,000 bp is synthesized from one thousand 100-mers. 
Again, the overlap between "pairs" of plus and minus oligonucleotides is 75 bases, leaving a 25 
base pair overhang. In this method, a combinatorial approach is used where corresponding 
pairs of partially complementary oligonucleotides are hybridized in a first step. A second round 
of hybridization then is undertaken with appropriately complementary pairs of products from 
the first round. This process is repeated a total of 10 times, each round of hybridization 
reducing the number of products by half. Ligation of the products then is performed. 
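
The figure of 10 rounds follows from repeated halving of one thousand first-round products, as the short sketch below illustrates. The function name is hypothetical, and rounding up when a round contains an odd number of products (the odd product being carried forward) is an assumption, since the text does not say how an odd count is handled.

```python
# Illustrative sketch only; rounding up for odd counts is an assumption.
def rounds_to_single_product(n_products):
    """Count pairwise hybridization rounds needed to reduce n products to one.

    Returns the number of rounds and the number of products present before the
    first round and after each subsequent round.
    """
    if n_products < 1:
        raise ValueError("need at least one product")
    trace = [n_products]
    while trace[-1] > 1:
        trace.append((trace[-1] + 1) // 2)  # pairs, plus any unpaired product
    return len(trace) - 1, trace
```

Starting from 1000 products, the counts run 500, 250, 125, 63, 32, 16, 8, 4, 2 and 1, which is 10 rounds.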



EXAMPLE 3 
Large scale expression of human gene products 

Once the human genome has been characterized, functional analysis of the human 
genome, based upon the complete sequence, will require a variety of approaches to structural, 
functional and network biology. The approach proposed herein for producing a series of 
expression constructs representing all potential human gene products and the assembly of sets 
of bacteria and/or yeast expressing these products will provide an important avenue into the 
beginnings of functional analysis. 



Secondly, the approach described here, when developed to its theoretical optimum, will 
allow the large scale transfer of genes to cell lines or organisms for functional analysis. The 
long term goal of this concept is the creation of living organisms entirely based on 
bioinformatics and information processing. Obviously, the knowledge of the complete 
sequence is not sufficient to appreciate the myriad of biological concepts inherent in life. 



EXAMPLE 4 
Construction of a synthetic plasmid 

A DNA molecule was designed using synthetic parts of previously known plasmids. As 
a demonstration of this technique, plasmid synlux4 was designed. Synlux4 consists of 4800 
base pairs of DNA. Within this sequence are included the sequence of lux A and lux B, the A 
and B components of the luciferase protein from Vibrio fischeri, portions of plasmid pUC19 
including the origin of replication and replication stability sequences, and the promoter and coding 
sequence for tn9 kanamycin/neomycin phosphotransferase. The sequence was designed on a 
computer using Microsoft Word and Vector NTI (InforMax, Inc.). The sequence is listed in 
FIG. 4A-FIG. 4C. 

Following design, a computer program SynGene 2.0 was used to break the sequence 
down into components consisting of overlapping 50-mer oligonucleotides. From the 4800 base 
pair sequence, 192 50-mers were designed. The component oligonucleotides are listed in FIG. 
5A-FIG. 5F. These component oligonucleotides were synthesized using a custom 96-well 
oligonucleotide synthesizer (Rayner et al., Genome Research, 8, 741-747 (1998)). The 
component oligonucleotides were produced in two 96-well microtitre plates, each plate holding 
one set of component oligonucleotides. Thus, plate one held the forward strand oligos and plate 
2 held the reverse strand oligos. 
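
The plate arithmetic may be illustrated as follows: 4800 bases divided into 50-mers gives 96 oligonucleotides per strand, one full 96-well plate each, for 192 in total. In the sketch below, the row-by-row filling order (A1, A2, ... A12, B1, ...) and the function names are assumptions, as the text does not specify which well held which oligonucleotide; the "oligoF" prefix follows the labels of FIG. 5A.

```python
# Illustrative sketch only; the row-major well order is an assumption.
ROWS = "ABCDEFGH"  # 8 rows of a 96-well plate
COLS = 12          # 12 columns


def well_for(index):
    """Map a 0-based oligonucleotide index (0..95) to a well name such as 'A1'."""
    if not 0 <= index < len(ROWS) * COLS:
        raise ValueError("index outside a 96-well plate")
    row, col = divmod(index, COLS)
    return f"{ROWS[row]}{col + 1}"


def plate_layout(seq_length, oligo_length, prefix):
    """Assign each oligonucleotide of one strand to a well of a single plate."""
    n = seq_length // oligo_length
    return {f"{prefix}{i + 1:02d}": well_for(i) for i in range(n)}
```

Calling plate_layout(4800, 50, "oligoF") assigns oligoF01 to well A1 and oligoF96 to well H12, filling exactly one plate; the reverse strand set would fill the second plate in the same way.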

The oligonucleotides were assembled and ligations carried out using a Biomek 1000 
robotic workstation (Beckman). Sequential transfers of oligonucleotides were done by 
pipetting from one well to a second well of the plate and a ligation reaction carried out using T4 
ligase. The pattern of assembly is delineated in FIG. 6A-FIG. 6B. 

Following assembly, the resulting ligation mix was used to transform competent E. coli 
strain DH5α. The transformation mix was plated on LB plates containing 25 µg/ml kanamycin 
sulfate, and recombinant colonies obtained. The resulting recombinant clones were isolated, 
cloned, and DNA prepared. The DNA was analyzed on 1% agarose gels in order to detect 
recombinant molecules. Clones were shown to contain the expected 4800 base pair plasmid 
containing lux A and B genes. 

* * * 

All of the compositions and/or methods disclosed and claimed herein can be made and 
executed without undue experimentation in light of the present disclosure. While the 
compositions and methods of this invention have been described in terms of preferred 
embodiments, it will be apparent to those of skill in the art that variations may be applied to the 
compositions and/or methods and in the steps or in the sequence of steps of the method 
described herein without departing from the concept, spirit and scope of the invention. More 
specifically, it will be apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents described herein while the same or 
similar results would be achieved. All such similar substitutes and modifications apparent to 
those skilled in the art are deemed to be within the spirit, scope and concept of the invention as 
defined by the appended claims. 

CLAIMS: 

1. A method for the synthesis of a replication-competent, double-stranded polynucleotide, 
wherein said polynucleotide comprises an origin of replication, a first coding region and 
a first regulatory element directing the expression of said first coding region, comprising 
the steps of: 

(a) generating a first set of oligonucleotides corresponding to the entire plus strand 
of said double-stranded polynucleotide; 

(b) generating a second set of oligonucleotides corresponding to the entire minus 
strand of said double-stranded polynucleotide; and 

(c) annealing said first and said second set of oligonucleotides; 

wherein each of said oligonucleotides of said second set of oligonucleotides overlaps 
with and hybridizes to two complementary oligonucleotides of said first set of 
oligonucleotides, except that two oligonucleotides at a 5' or 3' end of said 
double-stranded polynucleotide will hybridize with only one complementary oligonucleotide. 

2. The method of claim 1, further comprising the step of treating said annealed 
oligonucleotides with a ligating enzyme to generate continuous strands of said 
double-stranded polynucleotide. 

3. The method of claim 1, further comprising the step of amplifying said double-stranded 
polynucleotide. 

4. The method of claim 1, wherein said double-stranded polynucleotide comprises 100, 
200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, 10 x 10^3, 20 x 10^3, 30 x 10^3, 40 x 
10^3, 50 x 10^3, 60 x 10^3, 70 x 10^3, 80 x 10^3, 90 x 10^3, 1 x 10^4, 1 x 10^5, 1 x 10^6, 1 x 10^7, 1 
x 10^8, 1 x 10^9 or 1 x 10^10 base pairs in length. 



5. The method of claim 1, wherein said first regulatory element is a promoter. 

6. The method of claim 5, wherein said double-stranded polynucleotide comprises a 
second regulatory element, said second regulatory element being a polyadenylation 
signal. 

7. The method of claim 1, wherein said double-stranded polynucleotide comprises a 
plurality of coding regions and a plurality of regulatory elements. 



8. The method of claim 7, wherein said coding regions encode products that comprise a 
biochemical pathway. 

9. The method of claim 8, wherein said biochemical pathway is glycolysis. 

10. The method of claim 9, wherein said coding regions encode enzymes selected from the 
group consisting of hexokinase, phosphohexose isomerase, phosphofructokinase-1, 
aldolase, triose-phosphate isomerase, glyceraldehyde-3 -phosphate dehydrogenase, 
phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase. 

11. The method of claim 8, wherein said biochemical pathway is lipid synthesis. 

12. The method of claim 7, wherein said biochemical pathway is cofactor synthesis. 

13. The method of claim 13, wherein said pathway involves lipoic acid. 

14. The method of claim 13, wherein said biochemical pathway is riboflavin synthesis. 

15. The method of claim 7, wherein said biochemical pathway is nucleotide synthesis. 

16. The method of claim 15, wherein said nucleotide is a purine. 

17. The method of claim 15, wherein said nucleotide is a pyrimidine. 

18. The method of claim 7, wherein said coding regions encode enzymes involved in a 
cellular process selected from the group consisting of cell division, chaperone, 
detoxification, peptide secretion, energy metabolism, regulatory function, DNA 
replication, transcription, RNA processing and tRNA modification. 

19. The method of claim 18, wherein said energy metabolism is oxidative phosphorylation. 

20. The method of claim 1, wherein said double-stranded polynucleotide is a DNA. 

21. The method of claim 1, wherein said double-stranded polynucleotide is an RNA. 

22. The method of claim 1, wherein said double-stranded polynucleotide is an expression 
construct. 

23. The method of claim 22, wherein said expression construct is a bacterial expression 
construct. 



24. The method of claim 22, wherein said expression construct is a mammalian expression 
construct. 



25. The method of claim 17, wherein said expression construct is a viral expression 
construct. 



26. The method of claim 1, wherein said double-stranded polynucleotide comprises a 
genome selected from the group consisting of bacterial genome, yeast genome, viral 
genome, mammalian genome, amphibian genome and avian genome. 

27. The method of claim 1, wherein said overlap between the oligonucleotides of said first 
and said second set of oligonucleotides is between about 5 base pairs and about 75 base 
pairs. 




28. The method of claim 1, wherein said overlap is about 10 base pairs, about 15 base pairs, 
about 20 base pairs, about 25 base pairs, about 30 base pairs, about 35 base pairs, about 
40 base pairs, about 45 base pairs, about 50 base pairs, about 55 base pairs, about 60 
base pairs, about 65 base pairs, or about 70 base pairs. 

29. The method of claim 5, wherein said promoter is selected from the group consisting of 
CMV IE, SV40 IE, RSV, β-actin, tetracycline regulatable and ecdysone regulatable. 

30. The method of claim 26, wherein said genome is a viral genome. 

31. The method of claim 30, wherein said viral genome is selected from the group 
consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated 
virus. 

32. The method of claim 1, wherein said double-stranded polynucleotide is a chromosome. 

33. A method of producing a viral particle comprising the steps of: 

(a) providing a host cell; 

(b) transforming said host cell with an artificial viral genome prepared by: 

(i) generating a first set of oligonucleotides corresponding to the entire plus 
strand of said viral genome; 

(ii) generating a second set of oligonucleotides corresponding to the entire 
minus strand of said viral genome; and 

(iii) annealing said first and said second set of oligonucleotides; 

wherein each of said oligonucleotides of said second set of oligonucleotides 
overlaps with and hybridizes to two complementary oligonucleotides of said first 
set of oligonucleotides, except that two oligonucleotides at a 5' or 3' end of said 
viral genome will hybridize with only one complementary oligonucleotide; and 

(c) culturing said transformed host cell under conditions such that said viral 
particle is expressed. 



34. The method of claim 33, wherein said viral genome is selected from the group 
consisting of retrovirus, adenovirus, vaccinia virus, herpesvirus and adeno-associated 
virus. 

35. A method of producing an artificial genome, wherein said chromosome comprises all 
coding regions and regulatory elements found in a corresponding natural chromosome, 
comprising the steps of: 

(a) generating a first set of oligonucleotides corresponding to the entire plus strand 
of said chromosome; 

(b) generating a second set of oligonucleotides corresponding to the entire minus 
strand of said chromosome; and 

(c) annealing said first and said second set of oligonucleotides; 

wherein each of said oligonucleotides of said second set of oligonucleotides overlaps 
with and hybridizes to two complementary oligonucleotides of said first set of 
oligonucleotides, except that two oligonucleotides at a 5' or 3' end of said chromosome 
will hybridize with only one complementary oligonucleotide. 

36. The method of claim 35, wherein said corresponding natural chromosome is a human 
mitochondrial genome. 

37. The method of claim 35, wherein said corresponding natural chromosome is a 
chloroplast genome. 



38. A method of producing an artificial genetic system, wherein said system comprises all 
coding regions and regulatory elements found in a corresponding natural biochemical 
pathway, comprising the steps of: 

(a) generating a first set of oligonucleotides corresponding to the entire plus strand 
of said chromosome; 

(b) generating a second set of oligonucleotides corresponding to the entire minus 
strand of said chromosome; and 

(c) annealing said first and said second set of oligonucleotides; 

wherein each of said oligonucleotides of said second set of oligonucleotides overlaps 
with and hybridizes to two complementary oligonucleotides of said first set of 
oligonucleotides, except that two oligonucleotides at a 5' or 3' end of said chromosome 
will hybridize with only one complementary oligonucleotide 

wherein expression of said biochemical pathway coding regions results in the 
expression of a group of enzymes that serially metabolize a compound. 

39. The method of claim 38, wherein said biochemical pathway comprises the activities 
required for glycolysis. 

40. The method of claim 38, wherein said biochemical pathway comprises the enzymes 
required for electron transport. 



41. The method of claim 38, wherein said biochemical pathway comprises the enzyme 
activities required for photosynthesis. 






DNA SEQUENCE INFORMATION 

Step 1. Determine/design DNA sequence of the genome 

Step 2. Synthesize and assemble the genomic DNA 

Step 3. Introduce the DNA into an enucleated 
pluripotent host cell. 

Step 4. Introduce the host cell into a foster mother 
animal 

SYNTHETIC ORGANISM 

FIG. 1 

1. Design genome, containing prokaryotic origin of 
replication and drug selection vector. 

2. SynGene 2.0 breaks down genome into component 
overlapping oligonucleotides, programs 
oligonucleotide synthesizer. 

3. Chemical synthesis of component oligonucleotides 
using MERMADE high throughput synthesizer. 

4. Combinatoric assembly of component oligonucleotides 
using robotic processing. 

5. Transformation into competent bacteria. 

FIG. 2 

FIG. 3 

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i goFOl , MGCTTACCTCGATTTGAGGACGTTACAAGTATTACTGTTAA6GAGCGTA 
i goF02 , GAnAAAAMTGAMTTGAAAATGAATTATTAGAATTGGCTTAAATAAAC 
i goF03 , AGMTCACCAAAMGGMTAGAGTATGMGTTTGGAMTATnGTTTTTC 
i goF04 . GTATCAACCACCAGGTGAAACTCATAAGCTAAGTAATGGATCGCTTTGTT 
i goF05 . CGGCTTGGTATCGCCTCAGAAGAGTAGGGTTTGATACATATTGGACCTTA 
i goF06 , GMCATCATTrTACAGAGTnGGTCTTACGGGAMTTTATTTGTTGCTGC 
i goF07 , GGCTMCCTGTTAGGAAGAACTAAAACATTAAATGTTGGCACTATGGGGG 
i goF08 . TTGnATTCCGACAGCAGACCCAGnCGACAGTTAGMGACGTTTTATTA 
i goF09 . TTAGATCAMTGTCGAMGGTCGTTTTAATTTTGGAACCGTTCGAGGGCT 
i goFlO . ATACCATAMGATTTTCGAGTATTTGGTGTTGATATGGAAGAGTCTCGAG 
i goFll , CAATTACTCAAAATTTCTACCAGATGATAATGGAAAGCTTACAGACAGGA 
i goF12 . ACCATOGCTCTGATAGTGAnACATTCAATTTCCTAAGGTTGATGTATA 
i goF13 , TCCCAAAGTGTACTCAAAAAATGTACCAACCTGTATGACTGCTGAGTCCG 
i goF14 . CAAGTACGACAGAATGGCTAGCAATACAAGGGCTACCAATGGTTCTTAGT 
i goF15 . TGGATOTOGTACTMTGAAAAAAMGCA^GATGGAACTCTATAATGA 
i goF16 , AATTGCGACAGAATATGGTCATGATATATCTAAAATAGATCATTGTATGA 
i goF17 . CTTATATnGTTCTGTTGATGATGATGCACAAAAGGCGCAAGATGTTTGT 
i goF18 . CGGGAGTTTCTGAAAAATTGGTATGACTCATATGTAAATGCGACCAATAT 
i goF19 . CTTTAATGATAGCAATCAAACTCGTGGTTATGATTATCATAAAGGTCAAT 
i goF20 , GGCGTGATTnGTTTTACAAGGACATACAAACACCAATCGACGTGTTGAT 
i goF21 . TATAGCAATGGTATTAACGCTGTAGGCACTCCTGAGCAGTGTATTGAAAT 
i goF22 . CATTCAACGTGATATTGATGCAACGGGTATTACAAACATTACATGCGGAT 
i goF23 . TTGAAGCTAATGGAACTGAAGATGAAATAATTGCTTCCATGCGACGCTTT 
i goF24 . ATGACACMGTCGCTCCTTTCTTAAAAGAACCTAAATAAATTACTTATTT 
i goF25 . GATACTAGAGATMTMGGMWGTTATGAMTTTGGAnATTTTTTCT 
i goF26 . AAACTTTCAGAAAGATGGAATAACATCTGAAGAAACGTTGGATAATATGG 
i goF27 . TAMGACTGTCACGTTAATTGATTCAACTAAATATCATTTTAATACTGCC 
i goF28 . TTTGTTMTGMCATCACTTTTCAAAAAATGGTATTGTTGGAGCACCTAT 
i goF29 . TACCGCAGCTGGT1TTTTATTAGGGTTAACAAATAAATTACATATTGGTT 



FIG. 5A 



SUBSTITUTE SHEET (RULE 26) 




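oligoF30  CATTAAATCAAGTAATTACCACCCATCACCCTGTACGTGTAGCAGAAGAA
oligoF31  GCCAGTTTATTAGATCAAATGTCAGAGGGACGCTTCATTCTTGGTTTTAG
oligoF32  TGACTGCGAAAGTGATTTCGAAATGGAATTTTTTAGACGTCATATCTCAT
oligoF33  CAAGGCAACAACAATTTGAAGCATGCTATGAAATAATTAATGACGCATTA
oligoF34  ACTACAGGTTATTGTCATCCCCAAAACGACTTTTATGATTTTCCAAAGGT
oligoF35  TTCAATTAATCCACACTGTTACAGTGAGAATGGACCTAAGCAATATGTAT
oligoF36  CCGCTACATCAAAAGAAGTCGTCATGTGGGCAGCGAAAAAGGCACTGCCT
oligoF37  TTAACATTTAAGTGGGAGGATAATTTAGAAACCAAAGAACGCTATGCAAT
oligoF38  TCTATATAATAAAACAGCACAACAATATGGTATTGATATTTCGGATGTTG
oligoF39  ATCATCAATTAACTGTAATTGCGAACTTAAATGCTGATAGAAGTACGGCT
oligoF40  CAAGAAGAAGTGAGAGAATACTTAAAAGACTATATCACTGAAACTTACCC
oligoF41  TCAAATGGACAGAGATGAAAAAATTAACTGCATTATTGAAGAGAATGCAG
oligoF42  TTGGGTCTCATGATGACTATTATGAATCGACAAAATTAGCAGTGGAAAAA
oligoF43  ACAGGGTCTAAAAATATTTTATTATCCTTTGAATCAATGTCCGATATTAA
oligoF44  AGATGTAAAAGATATTATTGATATGTTGAACCAAAAAATCGAAATGAATT
oligoF45  TACCATAATAAAATTAAAGGCAATTTCTATATTAGATTGCCTTTTTGGGG
oligoF46  ATCCTCTAGAAATATTTTATCTGATTAATAAGATGAGAATTCACTGGCCG
oligoF47  TCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAAT
oligoF48  CGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGC
oligoF49  CCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGC
oligoF50  GCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGC
oligoF51  ATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCA
oligoF52  GCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGC
oligoF53  TCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATG
oligoF54  TGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCT
oligoF55  CGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTT
oligoF56  AGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGT
oligoF57  TTATTTTTCTAAAAAGCTTCACGCTGCCGCAAGCACTCAGGGCGCAAGGG
oligoF58  CTGCTAAAGGAAGCGGAACACGTAGAAAGCCAGTCCGCAGAAACGGTGCT
oligoF59  GACCCCGGATGAATGTCAGCTACTGGGCTATCTGGACAAGGGAAAACGCA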
oligoF63  TTGCCGCCAAGGATCTGATGGCGCAGGGGATCAAGATCTGATCAAGAGAC
oligoF64  AGGATGAGGATCGTTTCGCATGATTGAACAAGATGGATTGCACGCAGGTT
oligoF65  CTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAG
oligoF66  ACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCG
oligoF67  CCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGC
oligoF68  AGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGC
oligoF69  GCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATT
oligoF70  GGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCG
oligoF71  AGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGAT
oligoF72  CCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGC
oligoF73  ACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAG
oligoF74  AGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGC
oligoF75  ATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCC
oligoF76  GAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCC
oligoF77  GGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGAT
oligoF78  ATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA
oligoF79  CGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTG
oligoF80  ACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGA
oligoF81  CGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAA
oligoF82  AGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCA
oligoF83  GCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCGGGCATGACCAAA
oligoF84  ATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAA
oligoF85  GATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCT
oligoF86  TGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAA
oligoF87  GAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGAT
oligoF88  ACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGA
oligoF89  ACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTG
oligoF90  GCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACG
oligoF91  ATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCA
oligoF92  CACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAG
oligoF93  CGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAG
oligoF94  GTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTC
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ol i goR04 . AGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTC/CTTCGGTG1 A 
ol i goR05 GGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCG 
ol i goR06 ACCGCTGCGCCTTATCCGGTMCTATCGTCTTGAGTCCMCCCGGTAAGA 
ol iqoR07 CACGACTTATCGCCACTGGCAGCAGCCACTGGTMCAGGATTAGCAGAGC 
oligoR08 GAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACG 
ol i goR09 GCTACACTAGMGGACAGTATnGGTATCTGCGCTCTGCTGMGCCAGTT 
ol iaoRlO ' ACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGC 
ol IgoRll ' TGGTAGCGGTGGTTTTTTTGTnGWGCAGCAGATTACGCGCAGAAAAA 



ol igoRll JGGTAGCGGTGGT 1 1 1 1 1 lull luLAAbuwj^. 
ol i goR12 AAGGATCTCAAGMGATCCTnGATCTTTTCTACGGGGTCTGACGCTCAG 
ol i qoR13 TGGMCGAAMCTCACGTTAAGGGATTTTGGTCATGCCCGGGGTGGGCGA 
ol i qoR14 ' AGAACTCCAGCATGAGATCCCCGCGCTGGAGGATCATCCAGCCGGCGTCC 
oligoR15XGGAAAACGAmCGM(XCCMCCmCATAGMGGCGGCGCT^ 
ol i goR16 , GAAATCTCGTGATGGCAGGnGQXGTCGCTTGGTCGGTW^CGMCC 
<Ln 7 rriRARTrrrfirJCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGC 
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ol igoR33 . CAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGATCTTGATCCC 
ol i goR34 , CTGGGCCATCAGATCCTTGGCGGCMGAMGCCATCCAGTTTACTTTGCA 
ol i goR3.5 , GGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCTGGCAATTCCGGTTCGC 
ol i goR36 . TTGCTGTCCATAAAACCGCCCAGTCTAGCTATCGCCATGTAAGCCCACTG 
ol i goR37 , CMGCTACCTGCmCTCTTTGCGCTTGCGTTTTCCCnGTCCAGATAGC 
ol i goR38 , CCAGTAGCTGACATTCATCCGGGGTCAGCACCGTTTCTGCGGACTGGCTT 
ol i goR39 . TCTACGTGTTCCGCTTCCTTTAGCAGCCCTTGCGCCCTGAGTGCTTGCGG 
ol i goR40 . CAGCGTGMGCTTTTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATT 
Ol i goR41 . TCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACAT 
ol i goR42 . TAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTCGCGCGTTTC 
ol igoR43 , GGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCAC 
ol igoR44 .AGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGT 
ol i goR45 , CAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGAG 
ol i goR46 . CAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATG 
ol i goR47 . CGTAAGGAGAAAATACCGCATCAGGCGCCATTCGCCATTCAGGCTGCGCA 
ol i goR48 , ACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTG 
Ol i goR49 , GCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTT 
ol i goR50 . TTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAATTCTCATCTTATT 
ol i goR51 . AATCAGATAAAATATTTCTAGAGGATCCCCAAAAAGGCAATCTAATATAG 
ol i goR52 . AAATTGCCmMTmAnATGGTAMmAmC^TTTTTTGGnCA 
ol i goR53 , ACATATCMTMTATCTTTrACATCTTTAATATCGGACATTGATTCAAAG 
ol i goR54 , GATMTAAMTATTTTTAGACCCTGTTTTTTCCACTGCTMTTTT 
ol i goR55 . TTCATAATAGTCATCATGAGACCCAACTGCATTCTCTTCAATAATGCAGT 
ol i goR56 , TMTTTTTTCATCTCTGTCCATnGAGGGTMGlTTCAGTGATATAGTCT 
ol i goR57 , TnMGTAnCTCTCACnCTTCTTGAGCCGTACTTCTATCAGCATTTAA 
ol i goR58 , GTTCGCAATTACAGTTMTTGATGATCMCATCCGAAATATCAATACCAT 
ol i goR59 , ATTGTTGTGCTGTTnAnATATAGMTTGCATAGCGTTCTTTGGTTTCT 
ol i goR60 , AMTOTCCTCCCACnAMTGTTAMGGCAGTGCCTTTTTCGCTGCCCA 
ol i goR61 , CATGACGACnCTTTTGATGTAGCGGATACATATTGCTTAGGTCCATTCT 
ol i goR62 . CACTGTAACAGTGTGGATTAATTGAAACCTTTGGAAAATCATAAAAGTCG 
ol i goR63 . TTnGGGGATGACMTMCCTGTAGnMTGCGTCATTMTTATTTCATA 
ol i goR64 . GCATGCTTCAMnGnGTTGCCTTGATGAGATATGACGTCTAAAAAATT 
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ol i QOR66 TCTGACATTTGATCTAATAAACTGGCTTCTTCTGCTACACGTACA6GGTG 
0 ligoR67 ATGGGTGGTMnACmTnAATGAACCAATATGTAATnATTTGTTA 
ol igoR68 ACCCTAATAAAAAACCAGCTGCGGTAATAGGTGCTCCAACAATACCATTT 
oligoR69 TnGAAMGTGATGnCAnMCAMGGCAGTATTAAAATGATATTTAGT 
ol i goR70 ' TGMTCMTTMCGTGACAGTCTTTACCATATTATCCAACGTTTCTTCAG 
oligoR71 ATGnATTCCATCTTTCTGAMGTTTAGAAAAAATAATCCAAATTTCATA 
ol igoR72 ACnGnCCnATTATCTCTAGTATCAMTMGTMTTTATTTAGGTTCT 
ol i QOR73 ' TTTAAGAAAGGAGCGACTTGTGTCATAAAGCGTCGCATGGAAGCAATTAT 
ol igoR74 nCATCnCAGTTCCAnAGCTTCAAATCCGCATGTAATGTTTGTAATAC 
ol igoR75 CCGnGCATCMTATCACGTTGAATGATTTCAATACACTGCTCAGGAGTG 
ol igoR76 CCTACAGGGTTMTACCAnGCTATMTCMCACGTCGATTGGTGTTTGT 
ol igoR77 ' ATGTCCTTGTAAAACAAAATCACGCCATTGACCTTTATGATAATCATAAC 
ol i goR78 ' CACGAGTnGAnGCTATCATTAMGATATTGGTCGCATTTACATATGAG 
ol i goR79 ' TCATACCMT1TTTCAGAAACTCCCGACAAACATCTTGCGCCTTTTGTGC 
ol i goR80 ' ATCATCATCAACAGAACAAATATAAGTCATACAATGATCTATTTTAGATA 
ol igoR81 TATCATGACCATAnCTGTCGC^TTTCATTATAGAGTTCCATCTGTGCT 
ol i qoR82 ' TTTT1TTCATTAGTACCAATAATCCAACTAAGAACCATTGGTAGCGCTTG 
ol i goR83 ' TATTGCTAGCCAnCTGTCGTACTTGCGGACTCAGCAGTCATACAGGTTG 
ol i goR84 , GTACATTTTnGAGTACACTnGGGATATACATCMCCnAGGAMTTGA 
ol iqoR85 ATGTAATCACTATCAGAGCTAATGGTTCCTGTCTGTAAGCTTTCCATTAT 
ol i goR86 ' CATCTGGTAGAMTTTTGAGTAATTGCTCGAGACTCTTCCATATCAACAC 
ol i goR87 ' CAAATACTCGAAAATCTTTATGGTATAGCCCTCGAACGGTTCCAAAATTA 
ol i goR88 * AAACGACCTTTCGACATTTGATCTAATAATAAAACGTCTTCTAACTGTCG 
ol i goR89 ' AACTGGGTGTGCTGTCGGAATAACAACCCCCATAGTGCCAACATTTAATG 
ol i goR90 ' TITTAGnCTTCCTAACAGGTTAGCCGCAGCAACAAATAAATTTCCCGTA 
ol igoR91 ' AGACCAAACTCTGTAAAATGATGTTCTAAGGTCCAATATGTATCAAACCC 
ol i goR92 TACTCTTCTGAGGCGATACCAAGCCGAACAAAGCGATCCATTACTTAGCT 
Ol i goR93 ' TATGAGTTTCACCTGGTGGTTGATACGAAAAACAAATATTTCCAAACTTC 
oligoR94 ATACTCTATTCCTTTnGGTGAnCTGTnATTTAAGCCAATTCTAATAA 
ol i QOR95 ' TTCATTnCMTTTCATTTTnMTCTACGCTCCTTMCAGTMTACTTG 
ol i goR96 ' TAACGTCCTCAAATCGAGGTAAGCTTCATAGGCTCCGCCCCCCTGACGAG 
Instruction set for 192 oligos (96 pairs).

1. -F A1 --> -C A1
-F A2 --> -C A2
-F A3 --> -C A3
-F A4 --> -C A4

repeat with all wells to H12

-R A1 --> -C A1
-R A2 --> -C A2
-R A3 --> -C A3
-R A4 --> -C A4

repeat with all wells to H12

All remaining operations on -C plate

2. A1 --> A2
A3 --> A4
A5 --> A6
A7 --> A8
A9 --> A10
A11 --> A12

repeat with each letter

3. A2 --> A4
A6 --> A8
A10 --> A12

repeat with each letter

4. A4 --> A8
A12 --> B4
B8 --> B12
C4 --> C8
C12 --> D4
D8 --> D12
E4 --> E8
E12 --> F4
F8 --> F12
G4 --> G8
G12 --> H4
H8 --> H12

5. A8 --> B4
B12 --> C8
D4 --> D12
E8 --> F4
F12 --> G8
H4 --> H12

6. B4 --> C8
D12 --> F4
G8 --> H12

7. C8 --> H12
F4 --> H12
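
Steps 2 to 7 of the instruction set follow a regular pattern: the wells still holding material are paired with their neighbours, each pair is merged into the later well, and the last three wells are merged in a single step. The Python sketch below regenerates that schedule for a 96-well plate. It is an illustration under that reading of the table, not the robot's actual control program, and `plate_wells` and `pooling_rounds` are hypothetical names.

```python
from string import ascii_uppercase


def plate_wells(rows=8, cols=12):
    """Return well names A1..H12 in row-major order."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def pooling_rounds(wells):
    """Return one list of (source, destination) transfers per pooling round."""
    rounds = []
    live = list(wells)  # wells that still hold an unmerged pool
    while len(live) > 1:
        if len(live) <= 3:
            # Final round: everything left goes into the last well.
            transfers = [(src, live[-1]) for src in live[:-1]]
            live = [live[-1]]
        else:
            # Merge each well into its right-hand neighbour.
            transfers = [(live[i], live[i + 1]) for i in range(0, len(live) - 1, 2)]
            survivors = [dst for _, dst in transfers]
            if len(live) % 2:
                survivors.append(live[-1])  # unpaired well waits for the next round
            live = survivors
        rounds.append(transfers)
    return rounds


if __name__ == "__main__":
    # Numbering starts at 2 because step 1 combines the -F and -R plates.
    for step, transfers in enumerate(pooling_rounds(plate_wells()), start=2):
        print(step, " ".join(f"{s}-->{d}" for s, d in transfers[:3]), "...")
```

Six rounds are produced for 96 wells, which correspond to steps 2 through 7 above and leave the fully assembled product in well H12.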
program Syn_Gene_Formatter (input, output, f, g, h);

{Synthetic Gene Formatting Program}
{This is a draft experimental program designed to break down a designer gene or genome}
{into oligonucleotides for synthesis. The program is for complete synthetic designer gene}
{construction. The program is based upon an original program for formatting DNA sequences}
{written in 1988 by G. Evans for DNA analysis and formatting}
{This program is copyright (c) 1997 Glen A. Evans. All rights reserved}

const

  maxlength = 5000; {maximum length of sequence}
  searchlength = 10; {maximum length of search string}

var

  f: text; {input file of sequence}
  g: text; {output file of sequence}
  h: text; {output file of sequence}

  {arrays for sequence formatting}

  dna: array[1..maxlength] of char;
  rdna: array[1..maxlength] of char;
  oligo: array[1..100] of char;

  i, k, seqlength: integer;
  nucin: char;

  oligolength, offset: integer;
  infile, outfile: string;

FIG.7A 
procedure initialize;

{This procedure initializes the program and opens the input file}

var

  s: string;

begin
  repeat
    write('>');
    readln(s);
  until length(s) = 0;
  writeln('Welcome to Syn_Gene_Formatter Version 1.0 - copyright (c) Glen A. Evans 1997');
  write('Enter the input file name: ');
  readln(infile);
  write('Enter the output file name: ');
  readln(outfile);
  write('Enter the length of oligos you wish to use. ');
  readln(oligolength);
  write('Enter the reverse oligo offset value: ');
  readln(offset);
  writeln('Thank you.');
  write('The program will now format the sequence into oligonucleotide fragments of length ');
  write(oligolength);
  writeln;
  writeln;

end; {initialize}

procedure readinseq;

var

  j: integer;
  nuc: char;

FIG. 7B
begin

  writeln('reading input file');
  seqlength := 1;
  while not eof(f) do
  begin
    read(f, nuc);
    {store only the recognized symbols, so that line breaks and other}
    {stray characters do not leave unfilled positions in dna}
    if (nuc = 'G') or (nuc = 'A') or (nuc = 'T') or
       (nuc = 'C') or (nuc = 'X') or (nuc = 'N') then
    begin
      dna[seqlength] := nuc;
      seqlength := seqlength + 1;
    end;
  end;

  seqlength := seqlength - 1;
end; {readinseq}

procedure readinfile;
begin

  reset(f, infile);
  readinseq;
  close(f);
end; {readinfile}

FIG. 7C 
procedure writeforseq;
var

  i, h, b, on: integer;

begin

  write('fragmenting sequence into forward oligos');

  b := 1;
  on := 1;

  rewrite(g, outfile);
  writeln(g, infile);

  while b < seqlength + 1 do
  begin
    write('.');
    write(g, 'Foligo No. ', on, ' . ');
    begin
      for h := 1 to oligolength do
      begin
        write(g, dna[b]);
        b := b + 1;
      end;
      on := on + 1;
      writeln(g);
    end;
  end;
  writeln;

end; {writeforseq}

FIG. 7D 
procedure reverseseq;

{This procedure generates the reverse complement of the sequence}

var

  i, h, b, a, on: integer;
begin

  write('generating the reverse complement');
  b := seqlength;
  for a := 1 to seqlength do
  begin
    if dna[b] = 'G' then
      rdna[a] := 'C';
    if dna[b] = 'A' then
      rdna[a] := 'T';
    if dna[b] = 'T' then
      rdna[a] := 'A';
    if dna[b] = 'C' then
      rdna[a] := 'G';
    b := b - 1;
    write('.');
  end;
  writeln;

end; {reverseseq}
procedure writerevseq;

FIG. 7E 




{This procedure fragments the reverse complement sequence starting at the offset value}

var

  i, h, b, on: integer;

begin
  write('fragmenting sequence into reverse oligos');

  on := 1;
  b := offset;

  while b < seqlength do
  begin
    writeln(g);
    write(g, 'Roligo No. ', on, ' . ');
    begin
      for h := 1 to oligolength do
      begin
        write(g, rdna[b]);
        b := b + 1;
      end;
      on := on + 1;
      write('.');
    end;
  end;

end; {writerevseq}
procedure finaloligo;

var

  b, a: integer;
begin

  writeln;
  writeln('generating the last portion of the final oligo...');

  for a := 1 to offset do
  begin
    write(g, rdna[a]);
  end;

  writeln(g);
  close(g);

end; {finaloligo}

begin {main}

  initialize;
  readinfile;
  writeforseq;
  reverseseq;
  writerevseq;
  finaloligo;

  writeln('processing completed');
  writeln('Have a nice day.');

end. {main}
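
For readers without a Pascal compiler, the fragmenting logic of FIGS. 7A-7G can be approximated in a few lines of Python. This is a loose sketch rather than a line-by-line port. It assumes a circular sequence, so the reverse-strand fragments are taken from the reverse complement rotated by the offset (0-based here) and the wrap-around piece written by `finaloligo` is folded into the last fragment. The function names are hypothetical, and the symbols `X` and `N` are simply passed through unchanged, which is an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
VALID = set("ACGTNX")  # the six symbols accepted by readinseq


def clean(raw):
    """Keep only recognized sequence symbols, upper-cased."""
    return "".join(ch for ch in raw.upper() if ch in VALID)


def reverse_complement(seq):
    """Return the reverse complement; X and N are left as they are."""
    return seq.translate(COMPLEMENT)[::-1]


def chunks(seq, size):
    """Split seq into consecutive pieces of at most `size` symbols."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def format_oligos(raw, oligo_len=50, offset=25):
    """Return (forward_oligos, reverse_oligos) for a circular sequence."""
    seq = clean(raw)
    forward = chunks(seq, oligo_len)
    rc = reverse_complement(seq)
    rotated = rc[offset:] + rc[:offset]  # start at the offset, wrap at the end
    return forward, chunks(rotated, oligo_len)


if __name__ == "__main__":
    fwd, rev = format_oligos("aagc ttac\nctcg", oligo_len=4, offset=2)
    for n, oligo in enumerate(fwd, start=1):
        print(f"Foligo No. {n} . {oligo}")
    for n, oligo in enumerate(rev, start=1):
        print(f"Roligo No. {n} . {oligo}")
```

In the toy run each reverse fragment is complementary to the junction between two neighbouring forward fragments, which is the property the assembly scheme relies on.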
SEQUENCE LISTING

<110> Evans, Glen A.

<120> METHOD FOR THE COMPLETE CHEMICAL SYNTHESIS AND ASSEMBLY
OF GENES AND GENOMES

<130> UTFD:572P

<140> Unknown
<141> 1998-09-16

<150> US 60/059,017
<151> 1997-09-16

<160> 193

<170> PatentIn Ver. 2.0

<210> 1 
<211> 4800 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
plasmid 

<400> 1 

aagcttacct cgatttgagg acgttacaag tattactgtt aaggagcgta gattaaaaaa 60 
tgaaattgaa aatgaattat tagaattggc ttaaataaac agaatcacca aaaaggaata 120 
gagtatgaag tttggaaata tttgtttttc gtatcaacca ccaggtgaaa ctcataagct 180 
aagtaatgga tcgctttgtt cggcttggta tcgcctcaga agagtagggt ttgatacata 240 
ttggacctta gaacatcatt ttacagagtt tggtcttacg ggaaatttat ttgttgctgc 300 
ggctaacctg ttaggaagaa ctaaaacatt aaatgttggc actatggggg ttgttattcc 360 
gacagcacac ccagttcgac agttagaaga cgttttatta ttagatcaaa tgtcgaaagg 420 
tcgttttaat tttggaaccg ttcgagggct ataccataaa gattttcgag tatttggtgt 480 
tgatatggaa gagtctcgag caattactca aaatttctac cagatgataa tggaaagctt 540 
acagacagga accattagct ctgatagtga ttacattcaa tttcctaagg ttgatgtata 600 
tcccaaagtg tactcaaaaa atgtaccaac ctgtatgact gctgagtccg caagtacgac 660 
agaatggcta gcaatacaag ggctaccaat ggttcttagt tggattattg gtactaatga 720 
aaaaaaagca cagatggaac tctataatga aattgcgaca gaatatggtc atgatatatc 780 
taaaatagat cattgtatga cttatatttg ttctgttgat gatgatgcac aaaaggcgca 840 
agatgtttgt cgggagtttc tgaaaaattg gtatgactca tatgtaaatg cgaccaatat 900 
ctttaatgat agcaatcaaa ctcgtggtta tgattatcat aaaggtcaat ggcgtgattt 960 
tgttttacaa ggacatacaa acaccaatcg acgtgttgat tatagcaatg gtattaaccc 1020 
tgtaggcact cctgagcagt gtattgaaat cattcaacgt gatattgatg caacgggtat 1080 
tacaaacatt acatgcggat ttgaagctaa tggaactgaa gatgaaataa ttgcttccat 1140 
gcgacgcttt atgacacaag tcgctccttt cttaaaagaa cctaaataaa ttacttattt 1200 
gatactagag ataataagga acaagttatg aaatttggat tattttttct aaactttcag 1260 
aaagatggaa taacatctga agaaacgttg gataatatgg taaagactgt cacgttaatt 1320 
gattcaacta aatatcattt taatactgcc tttgttaatg aacatcactt ttcaaaaaat 1380 
ggtattgttg gagcacctat taccgcagct ggttttttat tagggttaac aaataaatta 1440 
catattggtt cattaaatca agtaattacc acccatcacc ctgtacgtgt agcagaagaa 1500 
gccagtttat tagatcaaat gtcagaggga cgcttcattc ttggttttag tgactgcgaa 1560 
agtgatttcg aaatggaatt ttttagacgt catatctcat caaggcaaca acaatttgaa 1620 
gcatgctatg aaataattaa tgacgcatta actacaggtt attgtcatcc ccaaaacgac 1680 
ttttatgatt ttccaaaggt ttcaattaat ccacactgtt acagtgagaa tggacctaag 1740
caatatgtat ccgctacatc aaaagaagtc gtcatgtggg cagcgaaaaa ggcactgcct 1800 
ttaacattta agtgggagga taatttagaa accaaagaac gctatgcaat tctatataat 1860 
aaaacagcac aacaatatgg tattgatatt tcggatgttg atcatcaatt aactgtaatt 1920 
gcgaacttaa atgctgatag aagtacggct caagaagaag tgagagaata cttaaaagac 1980 
tatatcactg aaacttaccc tcaaatggac agagatgaaa aaattaactg cattattgaa 2040 
gagaatgcag ttgggtctca tgatgactat tatgaatcga caaaattagc agtggaaaaa 2100 
acagggtcta aaaatatttt attatccttt gaatcaatgt ccgatattaa agatgtaaaa 2160 
gatattattg atatgttgaa ccaaaaaatc gaaatgaatt taccataata aaattaaagg 2220
caatttctat attagattgc ctttttgggg atcctctaga aatattttat ctgattaata 2280 
agatgagaat tcactggccg tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac 2340 
ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc 2400 
ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatggc gcctgatgcg 2460 
gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca ctctcagtac 2520 
aatctgctct gatgccgcat agttaagcca gccccgacac ccgccaacac ccgctgacgc 2580 
gccctgacgg gcttgtctgc tcccggcatc cgcttacaga caagctgtga ccgtctccgg 2640
gagctgcatg tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac gaaagggcct 2700 
cgtgatacgc ctatttttat aggttaatgt catgataata atggtttctt agacgtcagg 2760 
tggcactttt cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaaaagcttc 2820 
acgctgccgc aagcactcag ggcgcaaggg ctgctaaagg aagcggaaca cgtagaaagc 2880 
cagtccgcag aaacggtgct gaccccggat gaatgtcagc tactgggcta tctggacaag 2940
ggaaaacgca agcgcaaaga gaaagcaggt agcttgcagt gggcttacat ggcgatagct 3000
agactgggcg gttttatgga cagcaagcga accggaattg ccagctgggg cgccctctgg 3060 
taaggttggg aagccctgca aagtaaactg gatggctttc ttgccgccaa ggatctgatg 3120 
gcgcagggga tcaagatctg atcaagagac aggatgagga tcgtttcgca tgattgaaca 3180 
agatggattg cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg 3240 
ggcacaacag acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg 3300
cccggttctt tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc 3360
agcgcggcta tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt 3420 
cactgaagcg ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc 3480 
atctcacctt gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca 3540 
tacgcttgat ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc 3600 
acgtactcgg atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg 3660 
gctcgcgcca gccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgaggatct 3720 
cgtcgtgacc catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc 3780 
tggattcatc gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc 3840 
tacccgtgat attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta 3900 
cggtatcgcc gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt 3960 
ctgagcggga ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga 4020 
gatttcgatt ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac 4080 
gccggctgga tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccggg 4140 
catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 4200 
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 4260 
aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 4320 
gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta 4380 
gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 4440 
gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 4500 
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 4560 
cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 4620 
cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 4680 
agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 4740 
tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 4800 
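
Rows of a sequence listing such as the one above carry the bases in blocks of ten followed by a running position counter, and in scanned copies the counter is sometimes split by a stray space. A small helper can strip both before the sequence is reused. The Python below is a minimal sketch with a hypothetical function name `parse_listing`; the two sample rows are quoted from SEQ ID NO: 1 and are not adjacent in the listing.

```python
import re


def parse_listing(rows):
    """Join listing rows into one lowercase string of bases.

    Everything that is not a, c, g, t or n (spaces, position
    counters, stray punctuation) is discarded.
    """
    return "".join(re.sub(r"[^acgtn]", "", row.lower()) for row in rows)


if __name__ == "__main__":
    rows = [
        "aagcttacct cgatttgagg acgttacaag tattactgtt aaggagcgta gattaaaaaa 60",
        "gatattattg atatgttgaa ccaaaaaatc gaaatgaatt taccataata aaattaaagg 2220",
    ]
    bases = parse_listing(rows)
    print(len(bases), bases[:50])
```

Applied to every row of SEQ ID NO: 1, the length of the result can be compared with the value declared in the `<211>` field as a quick check on the transcription.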



<210> 2 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 2 

aagcttacct cgatttgagg acgttacaag tattactgtt aaggagcgta 

<210> 3 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 3 

gattaaaaaa tgaaattgaa aatgaattat tagaattggc ttaaataaac 

<210> 4 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 4 

agaatcacca aaaaggaata gagtatgaag tttggaaata tttgtttttc 

<210> 5 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 5 

gtatcaacca ccaggtgaaa ctcataagct aagtaatgga tcgctttgtt 

<210> 6 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 6 

cggcttggta tcgcctcaga agagtagggt ttgatacata ttggacctta 
<210> 7 
<211> 50
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 7 

gaacatcatt ttacagagtt tggtcttacg ggaaatttat ttgttgctgc 

<210> 8 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 8 

ggctaacctg ttaggaagaa ctaaaacatt aaatgttggc actatggggg 

<210> 9 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 9 

ttgttattcc gacagcacac ccagttcgac agttagaaga cgttttatta 

<210> 10 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 10 

ttagatcaaa tgtcgaaagg tcgttttaat tttggaaccg ttcgagggct 

<210> 11 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 11 
ataccataaa gattttcgag tatttggtgt tgatatggaa gagtctcgag 50



<210> 12 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 12 

caattactca aaatttctac cagatgataa tggaaagctt acagacagga 



<210> 13 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 13 

accattagct ctgatagtga ttacattcaa tttcctaagg ttgatgtata 



<210> 14 

<211> 50
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 14 

tcccaaagtg tactcaaaaa atgtaccaac ctgtatgact gctgagtccg 



<210> 15 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide



<400> 15 

caagtacgac agaatggcta gcaatacaag ggctaccaat ggttcttagt 50 

<210> 16 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 16 

tggattattg gtactaatga aaaaaaagca cagatggaac tctataatga 50 

<210> 17
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 17 

aattgcgaca gaatatggtc atgatatatc taaaatagat cattgtatga 50 

<210> 18 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 18 

cttatatttg ttctgttgat gatgatgcac aaaaggcgca agatgtttgt 50 

<210> 19 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 19

cgggagtttc tgaaaaattg gtatgactca tatgtaaatg cgaccaatat 50

<210> 20 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 20 

ctttaatgat agcaatcaaa ctcgtggtta tgattatcat aaaggtcaat 50 

<210> 21 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 21 

ggcgtgattt tgttttacaa ggacatacaa acaccaatcg acgtgttgat 50 

<210> 22 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 22 

tatagcaatg gtattaaccc tgtaggcact cctgagcagt gtattgaaat 50 

<210> 23 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 23 

cattcaacgt gatattgatg caacgggtat tacaaacatt acatgcggat 50 

<210> 24 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 24 

ttgaagctaa tggaactgaa gatgaaataa ttgcttccat gcgacgcttt 50 

<210> 25 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 25 

atgacacaag tcgctccttt cttaaaagaa cctaaataaa ttacttattt 50 



<210> 26 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 26 

gatactagag ataataagga acaagttatg aaatttggat tattttttct 

<210> 27 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 27 

aaactttcag aaagatggaa taacatctga agaaacgttg gataatatgg 

<210> 28 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 28 

taaagactgt cacgttaatt gattcaacta aatatcattt taatactgcc 

<210> 29 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 29 

tttgttaatg aacatcactt ttcaaaaaat ggtattgttg gagcacctat 

<210> 30 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 30 
taccgcagct ggttttttat tagggttaac aaataaatta catattggtt 50

<210> 31
<211> 50
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 31

cattaaatca agtaattacc acccatcacc ctgtacgtgt agcagaagaa 50



<210> 32 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 32 

gccagtttat tagatcaaat gtcagaggga cgcttcattc ttggttttag 



<210> 33 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 33 

tgactgcgaa agtgatttcg aaatggaatt ttttagacgt catatctcat 



<210> 34 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 34 

caaggcaaca acaatttgaa gcatgctatg aaataattaa tgacgcatta 



<210> 35 
<211> 50 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 35 

actacaggtt attgtcatcc ccaaaacgac ttttatgatt ttccaaaggt 

<210> 36 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 36 

ttcaattaat ccacactgtt acagtgagaa tggacctaag caatatgtat 

<210> 37 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 37 

ccgctacatc aaaagaagtc gtcatgtggg cagcgaaaaa ggcactgcct 

<210> 38 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 38 

ttaacattta agtgggagga taatttagaa accaaagaac gctatgcaat 

<210> 39 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 39 

tctatataat aaaacagcac aacaatatgg tattgatatt tcggatgttg 

<210> 40 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 40 

atcatcaatt aactgtaatt gcgaacttaa atgctgatag aagtacggct 



<210> 41 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 41 

caagaagaag tgagagaata cttaaaagac tatatcactg aaacttaccc 

<210> 42 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 42 

tcaaatggac agagatgaaa aaattaactg cattattgaa gagaatgcag 



<210> 43 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 43 

ttgggtctca tgatgactat tatgaatcga caaaattagc agtggaaaaa 

<210> 44 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 44 

acagggtcta aaaatatttt attatccttt gaatcaatgt ccgatattaa 



<210> 45 
<211> 50
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 45 

agatgtaaaa gatattattg atatgttgaa ccaaaaaatc gaaatgaatt 50 

<210> 46 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 46 

taccataata aaattaaagg caatttctat attagattgc ctttttgggg 50 

<210> 47 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 47 

atcctctaga aatattttat ctgattaata agatgagaat tcactggccg 50 

<210> 48 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 48

tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat 50

<210> 49
<211> 50
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 49

cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc 50



<210> 50 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 50 

ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatggc 



<210> 51 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 51 

gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc 



<210> 52 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 52 

atatggtgca ctctcagtac aatctgctct gatgccgcat agttaagcca 

<210> 53 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 53 

gccccgacac ccgccaacac ccgctgacgc gccctgacgg gcttgtctgc 



<210> 54 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 54 

tcccggcatc cgcttacaga caagctgtga ccgtctccgg gagctgcatg 50 

<210> 55 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 55 

tgtcagaggt tttcaccgtc atcaccgaaa cgcgcgagac gaaagggcct 50 

<210> 56 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 56

cgtgatacgc ctatttttat aggttaatgt catgataata atggtttctt 50

<210> 57 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 57 

agacgtcagg tggcactttt cggggaaatg tgcgcggaac ccctatttgt 50 

<210> 58 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 58 

ttatttttct aaaaagcttc acgctgccgc aagcactcag ggcgcaaggg 

<210> 59 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 59 

ctgctaaagg aagcggaaca cgtagaaagc cagtccgcag aaacggtgct 

<210> 60 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 60 

gaccccggat gaatgtcagc tactgggcta tctggacaag ggaaaacgca 

<210> 61 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 61 

agcgcaaaga gaaagcaggt agcttgcagt gggcttacat ggcgatagct 

<210> 62 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 62 

agactgggcg gttttatgga cagcaagcga accggaattg ccagctgggg 

<210> 63 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 63 

cgccctctgg taaggttggg aagccctgca aagtaaactg gatggctttc 



<210> 64 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 64 

ttgccgccaa ggatctgatg gcgcagggga tcaagatctg atcaagagac 50 

<210> 65 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 65 

aggatgagga tcgtttcgca tgattgaaca agatggattg cacgcaggtt 50 

<210> 66 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 66 

ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 50 

<210> 67 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 67 

acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg 50 

<210> 68 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 68 
cccggttctt tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc 50



<210> 69 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 69 

aggacgaggc agcgcggcta tcgtggctgg ccacgacggg cgttccttgc 



<210> 70 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 70 

gcagctgtgc tcgacgttgt cactgaagcg ggaagggact ggctgctatt 



<210> 71 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 71 

gggcgaagtg ccggggcagg atctcctgtc atctcacctt gctcctgccg 

<210> 72 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 72 

agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 



<210> 73 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide
<400> 73 

ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc 50 

<210> 74 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 74 

acgtactcgg atggaagccg gtcttgtcga tcaggatgat ctggacgaag 50 

<210> 75 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 75 

agcatcaggg gctcgcgcca gccgaactgt tcgccaggct caaggcgcgc 50 

<210> 76 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 76 

atgcccgacg gcgaggatct cgtcgtgacc catggcgatg cctgcttgcc 50 

<210> 77 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 77 

gaatatcatg gtggaaaatg gccgcttttc tggattcatc gactgtggcc 50 

<210> 78 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 78 

ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 

<210> 79 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 79 

attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta 

<210> 80 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 80 

cggtatcgcc gctcccgatt cgcagcgcat cgccttctat cgccttcttg 

<210> 81 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 81 

acgagttctt ctgagcggga ctctggggtt cgaaatgacc gaccaagcga 

<210> 82 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 82 

cgcccaacct gccatcacga gatttcgatt ccaccgccgc cttctatgaa 
<210> 83 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 83 

aggttgggct tcggaatcgt tttccgggac gccggctgga tgatcctcca 

<210> 84 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 84 

gcgcggggat ctcatgctgg agttcttcgc ccaccccggg catgaccaaa 

<210> 85 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 85 

atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 

<210> 86 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 86 

gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 

<210> 87 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 87 
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa

<210> 88 
<211> 50 
<212> DNA 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 88 

gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 

<210> 89 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 89 

accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 



<210> 90 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 90 

actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 



<210> 91 

<211> 50 

<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 91 

gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 



<210> 92 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 92 

atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 

<210> 93 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 93 

cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag 

<210> 94 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 94 

cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 

<210> 95 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 95 

gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 

<210> 96 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 96 

cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 

<210> 97 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 97 

tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 

<210> 98 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 98 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 

<210> 99 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 99 

ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg 

<210> 100 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 100 

ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 

<210> 101 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 101 

agcgtggcgc tttctcatag ctcacgctgt aggtatctca gttcggtgta 



<210> 102 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 102 

ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 

<210> 103 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 103 

accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 

<210> 104 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 104 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 

<210> 105 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 105 

gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg 

<210> 106 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide



<400> 106 
gctacactag aaggacagta tttggtatct gcgctctgct gaagccagtt

<210> 107 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 107 

accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc 

<210> 108 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 108 

tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 

<210> 109 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 109 

aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 

<210> 110 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 110 

tggaacgaaa actcacgtta agggattttg gtcatgcccg gggtgggcga 

<210> 111 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 111 

agaactccag catgagatcc ccgcgctgga ggatcatcca gccggcgtcc 50 

<210> 112 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 112 

cggaaaacga ttccgaagcc caacctttca tagaaggcgg cggtggaatc 50 

<210> 113 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 113 

gaaatctcgt gatggcaggt tgggcgtcgc ttggtcggtc atttcgaacc 50 

<210> 114 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 114 

ccagagtccc gctcagaaga actcgtcaag aaggcgatag aaggcgatgc 50 

<210> 115 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 115 

gctgcgaatc gggagcggcg ataccgtaaa gcacgaggaa gcggtcagcc 50 

<210> 116 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 116 

cattcgccgc caagctcttc agcaatatca cgggtagcca acgctatgtc 

<210> 117 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 117 

ctgatagcgg tccgccacac ccagccggcc acagtcgatg aatccagaaa 



<210> 118 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 118 

agcggccatt ttccaccatg atattcggca agcaggcatc gccatgggtc 

<210> 119 
<211> 50
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 119 

acgacgagat cctcgccgtc gggcatgcgc gccttgagcc tggcgaacag 

<210> 120 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 120 

ttcggctggc gcgagcccct gatgctcttc gtccagatca tcctgatcga 



<210> 121 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 121 

caagaccggc ttccatccga gtacgtgctc gctcgatgcg atgtttcgct 50 

<210> 122 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 122 

tggtggtcga atgggcaggt agccggatca agcgtatgca gccgccgcat 50 

<210> 123 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 123 

tgcatcagcc atgatggata ctttctcggc aggagcaagg tgagatgaca 50 

<210> 124 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 124 

ggagatcctg ccccggcact tcgcccaata gcagccagtc ccttcccgct 50 

<210> 125 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 125 
tcagtgacaa cgtcgagcac agctgcgcaa ggaacgcccg tcgtggccag 50



<210> 126 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 126 

ccacgatagc cgcgctgcct cgtcctgcag ttcattcagg gcaccggaca 



<210> 127 
<211> 50 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 127 

ggtcggtctt gacaaaaaga accgggcgcc cctgcgctga cagccggaac 



<210> 128 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 128 

acggcggcat cagagcagcc gattgtctgt tgtgcccagt catagccgaa 

<210> 129 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 129 

tagcctctcc acccaagcgg ccggagaacc tgcgtgcaat ccatcttgtt 

<210> 130 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 130 

caatcatgcg aaacgatcct catcctgtct cttgatcaga tcttgatccc 50 

<210> 131 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 131 

ctgcgccatc agatccttgg cggcaagaaa gccatccagt ttactttgca 50 

<210> 132 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 132 

gggcttccca accttaccag agggcgcccc agctggcaat tccggttcgc 50 

<210> 133 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 133 

ttgctgtcca taaaaccgcc cagtctagct atcgccatgt aagcccactg 50 

<210> 134 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 134 

caagctacct gctttctctt tgcgcttgcg ttttcccttg tccagatagc 50 

<210> 135 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 135 

ccagtagctg acattcatcc ggggtcagca ccgtttctgc ggactggctt 

<210> 136 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 136 

tctacgtgtt ccgcttcctt tagcagccct tgcgccctga gtgcttgcgg 

<210> 137 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 137 

cagcgtgaag ctttttagaa aaataaacaa ataggggttc cgcgcacatt 

<210> 138 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 138 

tccccgaaaa gtgccacctg acgtctaaga aaccattatt atcatgacat 

<210> 139 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 139 

taacctataa aaataggcgt atcacgaggc cctttcgtct cgcgcgtttc 
<210> 140 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 140 

ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 

<210> 141 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 141 

agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt 

<210> 142 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 142 

cagcgggtgt tggcgggtgt cggggctggc ttaactatgc ggcatcagag 

<210> 143 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 143 

cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg 

<210> 144 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 144 
cgtaaggaga aaataccgca tcaggcgcca ttcgccattc aggctgcgca

<210> 145 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 145 

actgttggga agggcgatcg gtgcgggcct cttcgctatt acgccagctg 

<210> 146 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 146 

gcgaaagggg gatgtgctgc aaggcgatta agttgggtaa cgccagggtt 

<210> 147 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 147 

ttcccagtca cgacgttgta aaacgacggc cagtgaattc tcatcttatt 

<210> 148 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 148 

aatcagataa aatatttcta gaggatcccc aaaaaggcaa tctaatatag 

<210> 149 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 149 

aaattgcctt taattttatt atggtaaatt catttcgatt ttttggtt- 

<210> 150 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 150 

acatatcaat aatatctttt acatctttaa tatcggacat tgattcaaag 

<210> 151 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 151 

gataataaaa tatttttaga ccctgttttt tccactgcta attttgtcga 

<210> 152 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 152 

ttcataatag tcatcatgag acccaactgc attctcttca ataatgcagt 

<210> 153 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 153 

taattttttc atctctgtcc atttgagggt aagtttcagt gatatagtct 

<210> 154 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 154

tttaagtatt ctctcacttc ttcttgagcc gtacttctat cagcatttaa 

<210> 155 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 155 

gttcgcaatt acagttaatt gatgatcaac atccgaaata tcaataccat 

<210> 156 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 156 

attgttgtgc tgttttatta tatagaattg catagcgttc tttggtttct 

<210> 157 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 157 

aaattatcct cccacttaaa tgttaaaggc agtgcctttt tcgctgccca 

<210> 158 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 158 

catgacgact tcttttgatg tagcggatac atattgctta ggtccattct 



<210> 159 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 159 

cactgtaaca gtgtggatta attgaaacct ttggaaaatc ataaaagtcg 50 

<210> 160 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 160 

ttttggggat gacaataacc tgtagttaat gcgtcattaa ttatttcata 50

<210> 161 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 161 

gcatgcttca aattgttgtt gccttgatga gatatgacgt ctaaaaaatt 50 

<210> 162 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 162 

ccatttcgaa atcactttcg cagtcactaa aaccaagaat gaagcgtccc 50 

<210> 163 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 163 
tctgacattt gatctaataa actggcttct tctgctacac gtacagggtg

<210> 164 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 164 

atgggtggta attacttgat ttaatgaacc aatatgtaat ttatttgtta 

<210> 165 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 165 

accctaataa aaaaccagct gcggtaatag gtgctccaac aataccattt 

<210> 166 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 166 

tttgaaaagt gatgttcatt aacaaaggca gtattaaaat gatatttagt 

<210> 167 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 167 

tgaatcaatt aacgtgacag tctttaccat attatccaac gtttcttcag 

<210> 168 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 168 

atgttattcc atctttctga aagtttagaa aaaataatcc aaatttcata 

<210> 169 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 169 

acttgttcct tattatctct agtatcaaat aagtaattta tttaggttct 

<210> 170 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 170 

tttaagaaag gagcgacttg tgtcataaag cgtcgcatgg aagcaattat 

<210> 171 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 171 

ttcatcttca gttccattag cttcaaatcc gcatgtaatg tttgtaatac 

<210> 172 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 172 

ccgttgcatc aatatcacgt tgaatgattt caatacactg ctcaggagtg 

<210> 173 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 173 

cctacagggt taataccatt gctataatca acacgtcgat tggtgtttgt 

<210> 174 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 174 

atgtccttgt aaaacaaaat cacgccattg acctttatga taatcataac 

<210> 175 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 175 

cacgagtttg attgctatca ttaaagatat tggtcgcatt tacatatgag 

<210> 176 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 176 

tcataccaat ttttcagaaa ctcccgacaa acatcttgcg ccttttgtgc 

<210> 177 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 177 

atcatcatca acagaacaaa tataagtcat acaatgatct attttagata 



<210> 178 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 178 

tatcatgacc atattctgtc gcaatttcat tatagagttc catctgtgct 

<210> 179 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 179 

tttttttcat tagtaccaat aatccaacta agaaccattg gtagcccttg 

<210> 180 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 180 

tattgctagc cattctgtcg tacttgcgga ctcagcagtc atacaggttg 

<210> 181 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 181 

gtacattttt tgagtacact ttgggatata catcaacctt aggaaattga 

<210> 182 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 182 
atgtaatcac tatcagagct aatggttcct gtctgtaagc tttccattat 50



<210> 183 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 183 

catctggtag aaattttgag taattgctcg agactcttcc atatcaacac 



<210> 184 
<211> 50 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 184 

caaatactcg aaaatcttta tggtatagcc ctcgaacggt tccaaaatta 



<210> 185 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 185 

aaacgacctt tcgacatttg atctaataat aaaacgtctt ctaactgtcg 

<210> 186 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 186 

aactgggtgt gctgtcggaa taacaacccc catagtgcca acatttaatg 

<210> 187 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide
<400> 187 

ttttagttct tcctaacagg ttagccgcag caacaaataa atttcccgta 

<210> 188
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 188 

agaccaaact ctgtaaaatg atgttctaag gtccaatatg tatcaaaccc 

<210> 189 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 



<400> 189 

tactcttctg aggcgatacc aagccgaaca aagcgatcca ttacttagct 

<210> 190 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide

<400> 190 

tatgagtttc acctggtggt tgatacgaaa aacaaatatt tccaaacttc 

<210> 191 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 191 

atactctatt cctttttggt gattctgttt atttaagcca attctaataa 

<210> 192 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 192 

ttcattttca atttcatttt ttaatctacg ctccttaaca gtaatacttg 

<210> 193 
<211> 50 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 193 

taacgtcctc aaatcgaggt aagcttcata ggctccgccc ccctgacgag 